Methods and Compositions for RNA Interference



Government Support

Work described herein was supported by National Institutes of Health Grant R01-GM62534.
The United States Government may have certain rights in the invention.



Background of the Invention

"RNA interference", "post-transcriptional gene silencing", "quelling": these
different names describe similar effects that result from the overexpression or
misexpression of transgenes, or from the deliberate introduction of double-stranded RNA
into cells (reviewed in Fire A (1999) Trends Genet 15:358-363; Sharp PA (1999) Genes
Dev 13:139-141; Hunter C (1999) Curr Biol 9:R440-R442; Baulcombe DC (1999) Curr
Biol 9:R599-R601; Vaucheret et al. (1998) Plant J 16:651-659). The injection of
double-stranded RNA into the nematode Caenorhabditis elegans, for example, acts systemically
to cause the post-transcriptional depletion of the homologous endogenous RNA (Fire et al.
(1998) Nature 391:806-811; and Montgomery et al. (1998) PNAS 95:15502-15507).
RNA interference, commonly referred to as RNAi, offers a way of specifically and
potently inactivating a cloned gene, and is proving a powerful tool for investigating gene
function. But the phenomenon is interesting in its own right; the mechanism has been
rather mysterious, but recent research (the latest reported by Smardon et al. (2000) Curr
Biol 10:169-178) is beginning to shed light on the nature and evolution of the biological
processes that underlie RNAi.

RNAi was discovered when researchers attempting to use the antisense RNA
approach to inactivate a C. elegans gene found that injection of sense-strand RNA was
actually as effective as the antisense RNA at inhibiting gene function (Guo et al. (1995)
Cell 81:611-620). Further investigation revealed that the active agent was modest amounts
of double-stranded RNA that contaminate in vitro RNA preparations. Researchers quickly
determined the 'rules' and effects of RNAi. Exon sequences are required, whereas introns
and promoter sequences, while ineffective, do not appear to compromise RNAi (though
there may be gene-specific exceptions to this rule). RNAi acts systemically: injection
into one tissue inhibits gene function in cells throughout the animal. The results of a
variety of experiments, in C. elegans and other organisms, indicate that RNAi acts to
destabilize cellular RNA after RNA processing.

The potency of RNAi inspired Timmons and Fire (1998 Nature 395:854) to do a
simple experiment that produced an astonishing result. They fed to nematodes bacteria that
had been engineered to express double-stranded RNA corresponding to the C. elegans
unc-22 gene. Amazingly, these nematodes developed a phenotype similar to that of unc-22
mutants that was dependent on their food source. The ability to conditionally expose large
numbers of nematodes to gene-specific double-stranded RNA formed the basis for a very
powerful screen to select for RNAi-defective C. elegans mutants and then to identify the
corresponding genes.

Double-stranded RNAs (dsRNAs) can provoke gene silencing in numerous in vivo
contexts including Drosophila, Caenorhabditis elegans, planaria, hydra, trypanosomes,
fungi and plants. However, the ability to recapitulate this phenomenon in higher
eukaryotes, particularly mammalian cells, has not been accomplished in the art. Nor has the
prior art demonstrated that this phenomenon can be observed in cultured eukaryotic cells.

Summary of the Invention

One aspect of the present invention provides a method for attenuating expression
of a target gene in a non-embryonic cell suspended in culture, comprising introducing into
the cell a double stranded RNA (dsRNA) in an amount sufficient to attenuate expression
of the target gene, wherein the dsRNA comprises a nucleotide sequence that hybridizes
under stringent conditions to a nucleotide sequence of the target gene.

Another aspect of the present invention provides a method for attenuating
expression of a target gene in a mammalian cell, comprising

(i) activating one or both of a Dicer activity or an Argonaut activity in the cell,
and

(ii) introducing into the cell a double stranded RNA (dsRNA) in an amount
sufficient to attenuate expression of the target gene, wherein the dsRNA
comprises a nucleotide sequence that hybridizes under stringent conditions to
a nucleotide sequence of the target gene.

In certain embodiments, the cell is suspended in culture; while in other embodiments the
cell is in a whole animal, such as a non-human mammal.

In certain preferred embodiments, the cell is engineered with (i) a recombinant
gene encoding a Dicer activity, (ii) a recombinant gene encoding an Argonaut activity, or
(iii) both. For instance, the recombinant gene may encode, for example, a protein which
includes an amino acid sequence at least 50 percent identical to SEQ ID No. 2 or 4; or be
defined by a coding sequence that hybridizes under wash conditions of 2 x SSC at 22°C to
SEQ ID No. 1 or 3. In certain embodiments, the recombinant gene may encode, for
example, a protein which includes an amino acid sequence at least 50 percent identical to
the Argonaut sequence shown in Figure 24.

In certain embodiments, rather than use a heterologous expression construct(s), an
endogenous Dicer gene or Argonaut gene can be activated, e.g., by gene activation
technology, expression of activated transcription factors or other signal transduction
protein, which induces expression of the gene, or by treatment with an endogenous factor
which upregulates the level of expression of the protein or inhibits the degradation of the
protein.

In certain preferred embodiments, the target gene is an endogenous gene of the
cell. In other embodiments, the target gene is a heterologous gene relative to the genome
of the cell, such as a pathogen gene, e.g., a viral gene.

In certain embodiments, the cell is treated with an agent that inhibits protein kinase
RNA-activated (PKR) apoptosis, such as by treatment with agents which inhibit
expression of PKR, cause its destruction, and/or inhibit the kinase activity of PKR.

In certain preferred embodiments, the cell is a primate cell, such as a human cell.

In certain embodiments, the dsRNA is at least 50 nucleotides in length, and 
preferably 400-800 nucleotides in length. 

Still another aspect of the present invention provides an assay for identifying
nucleic acid sequences responsible for conferring a particular phenotype in a cell,
comprising

(i) constructing a variegated library of nucleic acid sequences from a cell in an
orientation relative to a promoter to produce double stranded DNA;

(ii) introducing the variegated dsRNA library into a culture of target cells,
which cells have an activated Dicer activity or Argonaut activity;

(iii) identifying members of the library which confer a particular phenotype on
the cell, and identifying the sequence from a cell which corresponds, such as being
identical or homologous, to the library member.

Yet another aspect of the present invention provides a method of conducting a drug
discovery business comprising:

(i) identifying, by the assay of claim 16, a target gene which provides a
phenotypically desirable response when inhibited by RNAi;

(ii) identifying agents by their ability to inhibit expression of the target gene or
the activity of an expression product of the target gene;

(iii) conducting therapeutic profiling of agents identified in step (ii), or further
analogs thereof, for efficacy and toxicity in animals; and

(iv) formulating a pharmaceutical preparation including one or more agents
identified in step (iii) as having an acceptable therapeutic profile.

The method may include an additional step of establishing a distribution system for
distributing the pharmaceutical preparation for sale, and may optionally include
establishing a sales group for marketing the pharmaceutical preparation.

Another aspect of the present invention provides a method of conducting a target
discovery business comprising:

(i) identifying, by the assay of claim 16, a target gene which provides a
phenotypically desirable response when inhibited by RNAi;

(ii) (optionally) conducting therapeutic profiling of the target gene for efficacy
and toxicity in animals; and

(iii) licensing, to a third party, the rights for further drug development of
inhibitors of the target gene.

Another aspect of the invention provides a method for inhibiting RNAi by
inhibiting the expression or activity of an RNAi enzyme. Thus, the subject method may
include inhibiting the activity of Dicer and/or the 22-mer RNA.

Still another aspect relates to a method for altering the specificity of an RNAi
by modifying the sequence of the RNA component of the RNAi enzyme.

Another aspect of the invention relates to purified or semi-purified preparations of
the RNAi enzyme or components thereof. In certain embodiments, the preparations are
used for identifying compounds, especially small organic molecules, which inhibit or
potentiate the RNAi activity. Small molecule inhibitors, for example, can be used to
inhibit dsRNA responses in cells which are purposefully being transfected with a virus
which produces double stranded RNA.

The dsRNA construct may comprise one or more strands of polymerized
ribonucleotide. It may include modifications to either the phosphate-sugar backbone or the
nucleoside. The double-stranded structure may be formed by a single self-complementary
RNA strand or two complementary RNA strands. RNA duplex formation may be initiated
either inside or outside the cell. The dsRNA construct may be introduced in an amount
which allows delivery of at least one copy per cell. Higher doses of double-stranded
material may yield more effective inhibition. Inhibition is sequence-specific in that
nucleotide sequences corresponding to the duplex region of the RNA are targeted for
genetic inhibition. dsRNA constructs containing a nucleotide sequence identical to a
portion of the target gene are preferred for inhibition. RNA sequences with insertions,
deletions, and single point mutations relative to the target sequence have also been found
to be effective for inhibition. Thus, sequence identity may be optimized by alignment
algorithms known in the art and calculating the percent difference between the nucleotide
sequences. Alternatively, the duplex region of the RNA may be defined functionally as a
nucleotide sequence that is capable of hybridizing with a portion of the target gene
transcript.
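The percent-difference calculation mentioned above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the two sequences are assumed to have been aligned already to equal length by some alignment algorithm, with gaps written as "-", and the function names `percent_identity` and `percent_difference` and the toy sequences are hypothetical, not part of the specification.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned positions at which two sequences carry the same base.

    Assumption: both inputs are already aligned to the same length; a gap
    ("-") in either sequence counts as a difference. T and U are treated as
    equivalent so that DNA and RNA spellings can be compared.
    """
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be non-empty and aligned to equal length")
    a = aligned_a.upper().replace("T", "U")
    b = aligned_b.upper().replace("T", "U")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


def percent_difference(aligned_a: str, aligned_b: str) -> float:
    """Complement of percent identity over the same aligned positions."""
    return 100.0 - percent_identity(aligned_a, aligned_b)


if __name__ == "__main__":
    # Hypothetical toy sequences: a dsRNA strand with one point mutation
    # relative to the corresponding stretch of the target transcript.
    target = "AUGGCUAGCUAG"
    dsrna = "AUGGCUAGGUAG"
    print(percent_identity(dsrna, target), percent_difference(dsrna, target))
```

A real comparison would first run an alignment step; the sketch deliberately leaves that out and only shows how identity is scored once the alignment exists.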



Brief Description of the Drawings

Figure 1: RNAi in S2 cells. a, Drosophila S2 cells were transfected with a plasmid
that directs lacZ expression from the copia promoter in combination with dsRNAs
corresponding to either human CD8 or lacZ, or with no dsRNA, as indicated. b, S2 cells
were co-transfected with a plasmid that directs expression of a GFP-US9 fusion protein
(12) and dsRNAs of either lacZ or cyclin E, as indicated. Upper panels show FACS
profiles of the bulk population. Lower panels show FACS profiles from GFP-positive
cells. c, Total RNA was extracted from cells transfected with lacZ, cyclin E, fizzy or cyclin
A dsRNAs, as indicated. Northern blots were hybridized with sequences not present in the
transfected dsRNAs.

Figure 2: RNAi in vitro. a, Transcripts corresponding to either the first 600
nucleotides of Drosophila cyclin E (E600) or the first 800 nucleotides of lacZ (Z800) were
incubated in lysates derived from cells that had been transfected with either lacZ or cyclin
E (cycE) dsRNAs, as indicated. Time points were 0, 10, 20, 30, 40 and 60 min for cyclin E
and 0, 10, 20, 30 and 60 min for lacZ. b, Transcripts were incubated in an extract of S2
cells that had been transfected with cyclin E dsRNA (cross-hatched box, below).
Transcripts corresponded to the first 800 nucleotides of lacZ or the first 600, 300, 220 or
100 nucleotides of cyclin E, as indicated. Eout is a transcript derived from the portion of
the cyclin E cDNA not contained within the transfected dsRNA. E-ds is identical to the
dsRNA that had been transfected into S2 cells. Time points were 0 and 30 min.
c, Synthetic transcripts complementary to the complete cyclin E cDNA (Eas) or the final
600 nucleotides (Eas600) or 300 nucleotides (Eas300) were incubated in extract for 0 or
30 min.

Figure 3: Substrate requirements of the RISC. Extracts were prepared from cells
transfected with cyclin E dsRNA. Aliquots were incubated for 30 min at 30 °C before the
addition of either the cyclin E (E600) or lacZ (Z800) substrate. Individual 20-µl aliquots,
as indicated, were pre-incubated with 1 mM CaCl2 and 5 mM EGTA; 1 mM CaCl2, 5 mM
EGTA and 60 U of micrococcal nuclease; 1 mM CaCl2 and 60 U of micrococcal nuclease;
or 10 U of DNase I (Promega) and 5 mM EGTA. After the 30-min pre-incubation, EGTA
was added to those samples that lacked it. Yeast tRNA (1 µg) was added to all samples.
Time points were at 0 and 30 min.

Figure 4: The RISC contains a potential guide RNA. a, Northern blots of RNA
from either a crude lysate or the S100 fraction (containing the soluble nuclease activity,
see Methods) were hybridized to a riboprobe derived from the sense strand of the cyclin E
mRNA. b, Soluble cyclin-E-specific nuclease activity was fractionated as described in
Methods. Fractions from the anion-exchange resin were incubated with the lacZ control
substrate (upper panel) or the cyclin E substrate (centre panel). Lower panel, RNA from
each fraction was analysed by northern blotting with a uniformly labelled transcript
derived from the sense strand of the cyclin E cDNA. DNA oligonucleotides were used as size
markers.

Figure 5: Generation of 22mers and degradation of mRNA are carried out by
distinct enzymatic complexes. A. Extracts prepared either from 0-12 hour Drosophila
embryos or Drosophila S2 cells (see Methods) were incubated 0, 15, 30, or 60 minutes
(left to right) with a uniformly-labeled double-stranded RNA corresponding to the first
500 nucleotides of the Drosophila cyclin E coding region. M indicates a marker prepared
by in vitro transcription of a synthetic template. The template was designed to yield a 22
nucleotide transcript. The doublet most probably results from improper initiation at the +1
position. B. Whole-cell extracts were prepared from S2 cells that had been transfected
with a dsRNA corresponding to the first 500 nt of the luciferase coding region. S10
extracts were spun at 30,000xg for 20 minutes, which represents our standard RISC
extract. S100 extracts were prepared by further centrifugation of S10 extracts for 60
minutes at 100,000xg. Assays for mRNA degradation were carried out as described
previously for 0, 30 or 60 minutes (left to right in each set) with either a single-stranded
luciferase mRNA or a single-stranded cyclin E mRNA, as indicated. C. S10 or S100
extracts were incubated with cyclin E dsRNAs for 0, 60 or 120 minutes (L to R).

Figure 6: Production of 22mers by recombinant CG4792/Dicer. A. Drosophila
S2 cells were transfected with plasmids that direct the expression of T7-epitope tagged
versions of Drosha, CG4792/Dicer-1 and Homeless. Tagged proteins were purified from
cell lysates by immunoprecipitation and were incubated with cyclin E dsRNA. For
comparison, reactions were also performed in Drosophila embryo and S2 cell extracts. As
a negative control, immunoprecipitates were prepared from cells transfected with a
β-galactosidase expression vector. Pairs of lanes show reactions performed for 0 or 60
minutes. The synthetic marker (M) is as described in the legend to Figure 1. B.
Diagrammatic representations of the domain structures of CG4792/Dicer-1, Drosha and
Homeless are shown. C. Immunoprecipitates were prepared from detergent lysates of S2
cells using an antiserum raised against the C-terminal 8 amino acids of Drosophila Dicer-1
(CG4792). As controls, similar preparations were made with a pre-immune serum and
with an immune serum that had been pre-incubated with an excess of antigenic peptide.
Cleavage reactions in which each of these precipitates was incubated with an ~500 nt
fragment of Drosophila cyclin E are shown. For comparison, an incubation of the
substrate in Drosophila embryo extract was electrophoresed in parallel. D. Dicer
immunoprecipitates were incubated with dsRNA substrates in the presence or absence of
ATP. For comparison, the same substrate was incubated with S2 extracts that either
contained added ATP or that were depleted of ATP using glucose and hexokinase (see
Methods). E. Drosophila S2 cells were transfected with uniformly 32P-labelled dsRNA
corresponding to the first 500 nt of GFP. RISC complex was affinity purified using a
histidine-tagged version of D.m. Ago-2, a recently identified component of the RISC
complex (Hammond et al., in prep). RISC was isolated either under conditions in which it
remains ribosome associated (ls, low salt) or under conditions that extract it from the
ribosome in a soluble form (hs, high salt). For comparison, the spectrum of labelled
RNAs in the total lysate is shown. F. Guide RNAs produced by incubation of dsRNA
with a Dicer immunoprecipitate are compared to guide RNAs present in an affinity-purified
RISC complex. These precisely comigrate on a gel that has single-nucleotide resolution.
The lane labelled control is an affinity selection for RISC from cells that had been
transfected with labeled dsRNA but not with the epitope-tagged D.m. Ago-2.

Figure 7: Dicer participates in RNAi. A. Drosophila S2 cells were transfected
with dsRNAs corresponding to the two Drosophila Dicers (CG4792 and CG6493) or with
a control dsRNA corresponding to murine caspase 9. Cytoplasmic extracts of these cells
were tested for Dicer activity. Transfection with Dicer dsRNA reduced activity in lysates
by 7.4-fold. B. The Dicer-1 antiserum (CG4792) was used to prepare immunoprecipitates
from S2 cells that had been treated as described above. Dicer dsRNA reduced the activity
of Dicer-1 in this assay by 6.2-fold. C. Cells that had been transfected two days
previously with either mouse caspase 9 dsRNA or with Dicer dsRNA were cotransfected
with a GFP expression plasmid and either control luciferase dsRNA or GFP dsRNA.



WO 01/68836

PCT/US01/08435



Three independent experiments were quantified by FACS. A comparison of the relative
percentage of GFP-positive cells is shown for control (GFP plasmid plus luciferase
dsRNA) or silenced (GFP plasmid plus GFP dsRNA) populations in cells that had
previously been transfected with either control (caspase 9) or Dicer dsRNAs.

Figure 8: Dicer is an evolutionarily conserved ribonuclease. A. A model for
production of 22mers by Dicer. Based upon the proposed mechanism of action of
Ribonuclease III, we propose that Dicer acts on its substrate as a dimer. The positioning
of the two ribonuclease domains (RIIIa and RIIIb) within the enzyme would thus
determine the size of the cleavage product. An equally plausible alternative model could
be derived in which the RIIIa and RIIIb domains of each Dicer enzyme would cleave in
concert at a single position. In this model, the size of the cleavage product would be
determined by interaction between two neighboring Dicer enzymes. B. Comparison of
the domain structures of potential Dicer homologs in various organisms (Drosophila -
CG4792, CG6493; C. elegans - K12H4.8; Arabidopsis - CARPEL FACTORY,
T25K16.4, AC012328_1; human Helicase-MOI; and S. pombe - YC9A_SCHPO). The
ZAP domains were identified both by analysis of individual sequences with Pfam and by
Psi-blast searches. The ZAP domain in the putative S. pombe Dicer is not detected by
Pfam but is identified by Psi-blast and is thus shown in a different color. For
comparison, a domain structure of the RDE1/QDE2/ARGONAUTE family is shown. It
should be noted that the ZAP domains are more similar within each of the Dicer and
ARGONAUTE families than they are between the two groups. C. An alignment of the
ZAP domains in selected Dicer and Argonaute family members is shown. The alignment
was produced using ClustalW.

Figure 9: Purification strategy for RISC (second step in RNAi model).

Figure 10: Fractionation of RISC activity over sizing column. Activity fractionates
as a 500 kD complex. Also, antibody to dm argonaute 2 cofractionates with activity.

Figure 11-13: Fractionation of RISC over monoS, monoQ, Hydroxyapatite
columns. Dm argonaute 2 protein also cofractionates.

Figure 14: Alignment of dm argonaute 2 with other family members.

Figure 15: Confirmation of dm argonaute 2. S2 cells were transfected with labeled
dsRNA and His tagged argonaute. Argonaute was isolated on nickel agarose and the RNA
component was identified on a 15% acrylamide gel.

Figure 16: S2 cell and embryo extracts were assayed for 22mer generating activity.

Figure 17: RISC can be separated from 22mer generating activity (dicer).
Spinning extracts (S100) can clear RISC activity from supernatant (left panel); however,
S100 spins still contain dicer activity (right panel).

Figure 18: Dicer is specific for dsRNA and prefers longer substrates.

Figure 19: Dicer was fractionated over several columns.

Figure 20: Identification of dicer as the enzyme which can process dsRNA into
22mers. Various RNase III family members were expressed with N-terminal tags,
immunoprecipitated, and assayed for 22mer generating activity (left panel). In the right
panel, antibodies to dicer could also precipitate 22mer generating activity.

Figure 21: Dicer requires ATP.

Figure 22: Dicer produces RNAs that are the same size as RNAs present in RISC.

Figure 23: Human dicer homolog when expressed and immunoprecipitated has
22mer generating activity.

Figure 24: Sequence of dm argonaute 2. Peptides identified by microsequencing
are shown in underline.

Figure 25: Molecular characterization of dm argonaute 2. The presence of an intron
in the coding sequence was determined by northern blotting using an intron probe. This results
in a different 5' reading frame than that of the published genome sequence. The number of
polyglutamine repeats was determined by genomic PCR.

Figure 26: Dicer activity can be created in human cells by expression of the human
dicer gene. Host cell was 293. Crude extracts had dicer activity, while activity was absent
from untransfected cells. Activity is not dissimilar to that seen in Drosophila embryo
extracts.

Figure 27: An ~500 nt fragment of the gene that is to be silenced (X) is inserted
into the modified vector as a stable direct repeat using standard cloning procedures.
Treatment with commercially available cre recombinase reverses sequences within the
loxP sites (L) to create an inverted repeat. This can be stably maintained and amplified in
an sbc mutant bacterial strain (DL759). Transcription in vivo from the promoter of choice
(P) yields a hairpin RNA that causes silencing. A zeocin resistance marker is included to
ensure maintenance of the direct and inverted repeat structures; however this is non-essential
in vivo and could be removed by pre-mRNA splicing if desired. Smith, N. A. et
al. Total silencing by intron-spliced hairpin RNAs. Nature 407, 319-20 (2000).
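The relationship between the direct repeat, the inverted repeat and the resulting hairpin in Figure 27 can be illustrated with a small Python sketch. It is a simplified model only: the recombinase step is represented as replacing the second copy of the fragment with its reverse complement, the loxP sites and the zeocin marker are collapsed into a placeholder spacer, and all function names and sequences are hypothetical.

```python
# Watson-Crick complements for DNA; any other symbol (e.g. "N") is left unchanged.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def direct_repeat(fragment: str, spacer: str) -> str:
    """Two copies of the fragment in the same orientation around a spacer."""
    return fragment.upper() + spacer.upper() + fragment.upper()


def invert_second_copy(construct: str, arm_length: int) -> str:
    """Simplified stand-in for the inversion step: flip the second copy."""
    head, second = construct[:-arm_length], construct[-arm_length:]
    return head + reverse_complement(second)


def can_form_hairpin(construct: str, arm_length: int) -> bool:
    """True if the two arms are reverse complements, so that a transcript
    of the construct could fold back on itself into a stem."""
    return construct[:arm_length] == reverse_complement(construct[-arm_length:])


if __name__ == "__main__":
    fragment = "ATGGCA"   # hypothetical stand-in for the ~500 nt fragment X
    spacer = "NNNN"       # placeholder for the sequence between the repeats
    direct = direct_repeat(fragment, spacer)
    inverted = invert_second_copy(direct, len(fragment))
    print(direct, can_form_hairpin(direct, len(fragment)))
    print(inverted, can_form_hairpin(inverted, len(fragment)))
```

The point of the sketch is that only the inverted arrangement yields a self-complementary transcript; the direct repeat does not.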
Detailed Description of Certain Preferred Embodiments

I. Overview

The present invention provides methods for attenuating gene expression in a cell
using gene-targeted double stranded RNA (dsRNA). The dsRNA contains a nucleotide
sequence that hybridizes under physiologic conditions of the cell to the nucleotide
sequence of at least a portion of the gene to be inhibited (the "target" gene).

A significant aspect of certain embodiments of the present invention relates to the
demonstration in the present application that RNAi can in fact be accomplished in cultured
cells, rather than whole organisms as described in the art.

Another salient feature of the present invention concerns the ability to carry out
RNAi in higher eukaryotes, particularly in non-oocytic cells of mammals, e.g., cells from
adult mammals.

As described in further detail below, the present invention(s) are based on the
discovery that the RNAi phenomenon is mediated by a set of enzyme activities, including
an essential RNA component, that are evolutionarily conserved in eukaryotes ranging
from plants to mammals.

One enzyme contains an essential RNA component. After partial purification, a
multi-component nuclease (herein "RISC nuclease") co-fractionates with a discrete,
22-nucleotide RNA species which may confer specificity to the nuclease through homology
to the substrate mRNAs. The short RNA molecules are generated by a processing reaction
from the longer input dsRNA. Without wishing to be bound by any particular theory,
these 22mer RNAs may serve as guide sequences that instruct the RISC nuclease to
destroy specific mRNAs corresponding to the dsRNA sequences.
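The guide-sequence idea in the preceding paragraph can be illustrated with a toy Python model. It rests on loud simplifications that are assumptions of this sketch, not statements of the specification: the long dsRNA is represented only by its sense strand, processing is modelled as cutting consecutive 22-nucleotide pieces from one end, and homology is modelled as an exact substring match to the mRNA. The names `dice` and `is_targeted` and the sequences are hypothetical.

```python
from typing import List

GUIDE_LENGTH = 22  # size of the short RNA species described in the text


def dice(dsrna_sense: str, size: int = GUIDE_LENGTH) -> List[str]:
    """Cut a sequence into consecutive fragments of `size` nucleotides.

    Toy model of the processing reaction: an incomplete tail shorter than
    `size` is discarded, and T is normalised to U.
    """
    seq = dsrna_sense.upper().replace("T", "U")
    return [seq[i:i + size] for i in range(0, len(seq) - size + 1, size)]


def is_targeted(mrna: str, guides: List[str]) -> bool:
    """True if any guide fragment occurs exactly within the mRNA sequence."""
    transcript = mrna.upper().replace("T", "U")
    return any(guide in transcript for guide in guides)


if __name__ == "__main__":
    trigger = "AUGC" * 11                    # hypothetical 44 nt dsRNA sense strand
    homologous_mrna = "GG" + trigger + "CC"  # transcript containing the trigger
    unrelated_mrna = "A" * 60                # transcript with no shared sequence
    guides = dice(trigger)
    print(len(guides), is_targeted(homologous_mrna, guides),
          is_targeted(unrelated_mrna, guides))
```

In this model a transcript is marked for destruction only when it shares sequence with the input dsRNA, which mirrors the sequence-specificity the text attributes to the 22-nucleotide RNAs.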

The appended examples also identify an enzyme, Dicer, that can produce the
putative guide RNAs. Dicer is a member of the RNAse III family of nucleases that
specifically cleave dsRNA and is evolutionarily conserved in worms, flies, plants, fungi
and, as described herein, mammals. The enzyme has a distinctive structure which includes
a helicase domain and dual RNAse III motifs. Dicer also contains a region of homology to
the RDE1/QDE2/ARGONAUTE family, which have been genetically linked to RNAi in
lower eukaryotes. Indeed, activation of, or overexpression of, Dicer may be sufficient in
many cases to permit RNA interference in otherwise non-receptive cells, such as cultured
eukaryotic cells, or mammalian (non-oocytic) cells in culture or in whole organisms.

In certain embodiments, the cells can be treated with an agent(s) that inhibits the
double-stranded RNA-dependent protein known as PKR (protein kinase RNA-activated).
Double stranded RNAs in mammalian cells typically activate protein kinase PKR, which
phosphorylates and inactivates eIF2α (Fire (1999) Trends Genet 15:358). The ensuing
inhibition of protein synthesis ultimately results in apoptosis. This sequence-independent
response may reflect a form of primitive immune response, since the presence of dsRNA
is a common feature of many viral lifecycles. However, as described herein, Applicants
have demonstrated that the PKR response can be overcome in favor of the sequence-specific
RNAi response. Nevertheless, in certain instances, it can be desirable to treat the
cells with agents which inhibit expression of PKR, cause its destruction, and/or inhibit the
kinase activity of PKR, and such agents are specifically contemplated for use in the present
method. Likewise, overexpression of eIF2α, or agents which ectopically activate it, can be used.

Thus, the present invention provides a process and compositions for inhibiting
expression of a target gene in a cell, especially a mammalian cell. In certain embodiments,
the process comprises introduction of RNA (the "dsRNA construct") with partial or fully
double-stranded character into the cell or into the extracellular environment. Inhibition is
specific in that a nucleotide sequence from a portion of the target gene is chosen to
produce the dsRNA construct. In preferred embodiments, the method utilizes a cell in
which Dicer and/or Argonaute activities are recombinantly expressed or otherwise
ectopically activated. This process can be (1) effective in attenuating gene expression, (2)
specific to the targeted gene, and (3) general in allowing inhibition of many different types
of target gene.

II. Definitions 

For convenience, certain terms employed in the specification, examples, and
appended claims are collected here.

As used herein, the term "vector" refers to a nucleic acid molecule capable of
transporting another nucleic acid to which it has been linked. One type of vector is a
genomic integrated vector, or "integrated vector", which can become integrated into the
chromosomal DNA of the host cell. Another type of vector is an episomal vector, i.e., a
nucleic acid capable of extra-chromosomal replication. Vectors capable of directing the
expression of genes to which they are operatively linked are referred to herein as "expression
vectors". In the present specification, "plasmid" and "vector" are used interchangeably
unless otherwise clear from the context.

As used herein, the term "nucleic acid" refers to polynucleotides such as
deoxyribonucleic acid (DNA), and, where appropriate, ribonucleic acid (RNA). The term
should also be understood to include, as applicable to the embodiment being described,
single-stranded (such as sense or antisense) and double-stranded polynucleotides.

As used herein, the term "gene" or "recombinant gene" refers to a nucleic acid
comprising an open reading frame encoding a polypeptide of the present invention,
including both exon and (optionally) intron sequences. A "recombinant gene" refers to
nucleic acid encoding such regulatory polypeptides, that may optionally include intron
sequences that are derived from chromosomal DNA. The term "intron" refers to a DNA
sequence present in a given gene that is not translated into protein and is generally found
between exons. As used herein, the term "transfection" means the introduction of a
nucleic acid, e.g., an expression vector, into a recipient cell by nucleic acid-mediated gene
transfer.

A "protein coding sequence" or a sequence that "encodes" a particular polypeptide
or peptide, is a nucleic acid sequence that is transcribed (in the case of DNA) and is
translated (in the case of mRNA) into a polypeptide in vitro or in vivo when placed under
the control of appropriate regulatory sequences. The boundaries of the coding sequence
are determined by a start codon at the 5' (amino) terminus and a translation stop codon at
the 3' (carboxy) terminus. A coding sequence can include, but is not limited to, cDNA
from procaryotic or eukaryotic mRNA, genomic DNA sequences from procaryotic or
eukaryotic DNA, and even synthetic DNA sequences. A transcription termination
sequence will usually be located 3' to the coding sequence.

Likewise, "encodes", unless evident from its context, will be meant to include
DNA sequences that encode a polypeptide, as the term is typically used, as well as DNA
sequences that are transcribed into inhibitory antisense molecules.

The term "loss-of-function", as it refers to genes inhibited by the subject RNAi
method, refers to a diminishment in the level of expression of a gene when compared to the
level in the absence of dsRNA constructs.

The term "expression" with respect to a gene sequence refers to transcription of the 
30 gene and, as appropriate, translation of the resultmg mRNA transcript to a protein. Thus, 
as will be clear from the context, expression of a protein coding sequence results from 
transcription and translation of the coding sequence. 

"Cells," "host cells" or "recombinant host cells" are terms used interchangeably 
herein. It is understood that such terms refer not only to the particular subject cell but to
the progeny or potential progeny of such a cell. Because certain modifications may occur
in succeeding generations due to either mutation or environmental influences, such
progeny may not, in fact, be identical to the parent cell, but are still included within the 
scope of the term as used herein. 

By "recombinant virus" is meant a virus that has been genetically altered, e.g., by 
the addition or insertion of a heterologous nucleic acid construct into the particle.

As used herein, the terms "transduction" and "transfection" are art recognized and
mean the introduction of a nucleic acid, e.g., an expression vector, into a recipient cell by
nucleic acid-mediated gene transfer. "Transformation", as used herein, refers to a process
in which a cell's genotype is changed as a result of the cellular uptake of exogenous DNA
or RNA, and, for example, the transformed cell expresses a dsRNA construct.

"Transient transfection" refers to cases where exogenous DNA does not integrate 
into the genome of a transfected cell, e.g., where episomal DNA is transcribed into mRNA
and translated into protein. 

A cell has been "stably transfected" with a nucleic acid construct when the nucleic 
acid construct is capable of being inherited by daughter cells.

As used herein, a "reporter gene construct" is a nucleic acid that includes a 
"reporter gene" operatively linked to at least one transcriptional regulatory sequence. 
Transcription of the reporter gene is controlled by the sequences to which it is
linked. The activity of one or more of these control sequences can be directly or
indirectly regulated by the target receptor protein. Exemplary transcriptional control
sequences are promoter sequences. A reporter gene is meant to include a promoter- 
reporter gene construct that is heterologously expressed in a cell. 

As used herein, "transformed cells" refers to cells that have spontaneously
converted to a state of unrestrained growth, i.e., they have acquired the ability to grow
through an indefinite number of divisions in culture. Transformed cells may be
characterized by such terms as neoplastic, anaplastic and/or hyperplastic, with respect to
their loss of growth control. For purposes of this invention, the terms "transformed
phenotype of malignant mammalian cells" and "transformed phenotype" are intended to
encompass, but not be limited to, any of the following phenotypic traits associated with
cellular transformation of mammalian cells: immortalization, morphological or growth
transformation, and tumorigenicity, as detected by prolonged growth in cell culture,
growth in semi-solid media, or tumorigenic growth in immuno-incompetent or syngeneic
animals.

As used herein, "proliferating" and "proliferation" refer to cells undergoing
mitosis.

As used herein, "immortalized cells" refers to cells that have been altered via
chemical, genetic, and/or recombinant means such that the cells have the ability to grow
through an indefinite number of divisions in culture.

The "growth state" of a cell refers to the rate of proliferation of the cell and the 
state of differentiation of the cell.

III. Exemplary Embodiments of Isolation Method

One aspect of the invention provides a method for potentiating RNAi by induction
or ectopic activation of an RNAi enzyme in a cell (in vivo or in vitro) or cell-free
mixtures. In preferred embodiments, the RNAi activity is activated or added to a
mammalian cell, e.g., a human cell, which cell may be provided in vitro or as part of a
whole organism. In other embodiments, the subject method is carried out using eukaryotic
cells generally (except for oocytes) in culture. For instance, the Dicer enzyme may be
activated by virtue of being recombinantly expressed or it may be activated by use of an
agent which (i) induces expression of the endogenous gene, (ii) stabilizes the protein from
degradation, and/or (iii) allosterically modifies the enzyme to increase its activity (by
altering its Kcat, Km or both).
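
As an illustration only, the effect of altering Kcat or Km on enzyme activity can be sketched numerically. The sketch below assumes simple Michaelis-Menten kinetics, which the text does not state for Dicer; the function name and all numeric values are hypothetical and chosen purely for arithmetic convenience.

```python
def michaelis_menten_rate(kcat, km, enzyme_conc, substrate_conc):
    """Initial rate v = kcat * [E] * [S] / (Km + [S]).

    Assumes Michaelis-Menten kinetics (an assumption of this sketch,
    not a property asserted by the source text).
    """
    if km <= 0 or substrate_conc < 0 or enzyme_conc < 0:
        raise ValueError("Km must be positive; concentrations must be non-negative")
    return kcat * enzyme_conc * substrate_conc / (km + substrate_conc)


# Hypothetical, unitless values used only to compare the three cases.
baseline = michaelis_menten_rate(kcat=1.0, km=10.0, enzyme_conc=1.0, substrate_conc=10.0)
higher_kcat = michaelis_menten_rate(kcat=2.0, km=10.0, enzyme_conc=1.0, substrate_conc=10.0)
lower_km = michaelis_menten_rate(kcat=1.0, km=5.0, enzyme_conc=1.0, substrate_conc=10.0)
```

Under this model, either raising Kcat or lowering Km at a fixed substrate concentration increases the rate relative to the baseline, which is the sense in which an allosteric agent could increase activity.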

A. Dicer and Argonaut Activities 

In certain embodiments, at least one of the activated RNAi enzymes is Dicer, or a
homolog thereof. In certain preferred embodiments, the present method provides for
ectopic activation of Dicer. As used herein, the term "Dicer" refers to a protein which (a)
mediates an RNAi response and (b) has an amino acid sequence at least 50 percent
identical, and more preferably at least 75, 85, 90 or 95 percent identical to SEQ ID No. 2
or 4, and/or which can be encoded by a nucleic acid which hybridizes under wash
conditions of 2 x SSC at 22°C, and more preferably 0.2 x SSC at 65°C, to a nucleotide
sequence represented by SEQ ID No. 1 or 3. Accordingly, the method may comprise
introducing a dsRNA construct into a cell in which Dicer has been recombinantly
expressed or otherwise ectopically activated.

In certain embodiments, at least one of the activated RNAi enzymes is Argonaut, or
a homolog thereof. In certain preferred embodiments, the present method provides for
ectopic activation of Argonaut. As used herein, the term "Argonaut" refers to a protein
which (a) mediates an RNAi response and (b) has an amino acid sequence at least 50
percent identical, and more preferably at least 75, 85, 90 or 95 percent identical to the
amino acid sequence shown in Figure 24. Accordingly, the method may comprise
introducing a dsRNA construct into a cell in which Argonaut has been recombinantly
expressed or otherwise ectopically activated.

This invention also provides expression vectors containing a nucleic acid encoding
a Dicer or Argonaut polypeptide, operably linked to at least one transcriptional regulatory
sequence. Operably linked is intended to mean that the nucleotide sequence is linked to a
regulatory sequence in a manner which allows expression of the nucleotide sequence.
Regulatory sequences are art-recognized and are selected to direct expression of the
subject Dicer or Argonaut proteins. Accordingly, the term transcriptional regulatory
sequence includes promoters, enhancers and other expression control elements. Such
regulatory sequences are described in Goeddel; Gene Expression Technology: Methods in
Enzymology 185, Academic Press, San Diego, CA (1990). For instance, any of a wide
variety of expression control sequences, sequences that control the expression of a DNA
sequence when operatively linked to it, may be used in these vectors to express DNA
sequences encoding Dicer or Argonaut polypeptides of this invention. Such useful
expression control sequences include, for example, a viral LTR, such as the LTR of the
Moloney murine leukemia virus, the early and late promoters of SV40, adenovirus or
cytomegalovirus immediate early promoter, the lac system, the trp system, the TAC or
TRC system, T7 promoter whose expression is directed by T7 RNA polymerase, the major
operator and promoter regions of phage λ, the control regions for fd coat protein, the
promoter for 3-phosphoglycerate kinase or other glycolytic enzymes, the promoters of
acid phosphatase, e.g., Pho5, the promoters of the yeast α-mating factors, the polyhedron
promoter of the baculovirus system and other sequences known to control the expression
of genes of prokaryotic or eukaryotic cells or their viruses, and various combinations
thereof. It should be understood that the design of the expression vector may depend on
such factors as the choice of the host cell to be transformed and/or the type of protein
desired to be expressed.

Moreover, the vector's copy number, the ability to control that copy number and 
the expression of any other proteins encoded by the vector, such as antibiotic markers,
should also be considered. 

The recombinant Dicer or Argonaut genes can be produced by ligating nucleic acid
encoding a Dicer or Argonaut polypeptide into a vector suitable for expression in either
prokaryotic cells, eukaryotic cells, or both. Expression vectors for production of
recombinant forms of the subject Dicer or Argonaut polypeptides include plasmids and
other vectors. For instance, suitable vectors for the expression of a Dicer or Argonaut
polypeptide include plasmids of the types: pBR322-derived plasmids, pEMBL-derived
plasmids, pEX-derived plasmids, pBTac-derived plasmids and pUC-derived plasmids for
expression in prokaryotic cells, such as E. coli.

A number of vectors exist for the expression of recombinant proteins in yeast. For
instance, YEP24, YIP5, YEP51, YEP52, pYES2, and YRP17 are cloning and expression
vehicles useful in the introduction of genetic constructs into S. cerevisiae (see, for
example, Broach et al. (1983) in Experimental Manipulation of Gene Expression, ed. M.
Inouye, Academic Press, p. 83, incorporated by reference herein). These vectors can
replicate in E. coli due to the presence of the pBR322 ori, and in S. cerevisiae due to the
replication determinant of the yeast 2 micron plasmid. In addition, drug resistance
markers such as ampicillin can be used. In an illustrative embodiment, a Dicer or
Argonaut polypeptide is produced recombinantly utilizing an expression vector generated
by sub-cloning the coding sequence of a Dicer or Argonaut gene.

The preferred mammalian expression vectors contain both prokaryotic sequences,
to facilitate the propagation of the vector in bacteria, and one or more eukaryotic
transcription units that are expressed in eukaryotic cells. The pcDNAI/amp, pcDNAI/neo,
pRc/CMV, pSV2gpt, pSV2neo, pSV2-dhfr, pTk2, pRSVneo, pMSG, pSVT7, pko-neo and
pHyg derived vectors are examples of mammalian expression vectors suitable for
transfection of eukaryotic cells. Some of these vectors are modified with sequences from
bacterial plasmids, such as pBR322, to facilitate replication and drug resistance selection
in both prokaryotic and eukaryotic cells. Alternatively, derivatives of viruses such as the
bovine papillomavirus (BPV-1), or Epstein-Barr virus (pHEBo, pREP-derived and p205)
can be used for transient expression of proteins in eukaryotic cells. The various methods
employed in the preparation of the plasmids and transformation of host organisms are well
known in the art. For other suitable expression systems for both prokaryotic and
eukaryotic cells, as well as general recombinant procedures, see Molecular Cloning: A
Laboratory Manual, 2nd Ed., ed. by Sambrook, Fritsch and Maniatis (Cold Spring Harbor
Laboratory Press: 1989) Chapters 16 and 17.

In yet another embodiment, the subject invention provides a "gene activation"
construct which, by homologous recombination with a genomic DNA, alters the
transcriptional regulatory sequences of an endogenous Dicer or Argonaut gene. For
instance, the gene activation construct can replace the endogenous promoter of a Dicer or
Argonaut gene with a heterologous promoter, e.g., one which causes constitutive
expression of the Dicer or Argonaut gene or which causes inducible expression of the gene
under conditions different from the normal expression pattern of Dicer or Argonaut. A
variety of different formats for the gene activation constructs are available. See, for
example, the Transkaryotic Therapies, Inc PCT publications WO93/09222, WO95/31560,
WO96/29411, WO95/31560 and WO94/12650.

In preferred embodiments, the nucleotide sequence used as the gene activation
construct can be comprised of (1) DNA from some portion of the endogenous Dicer or
Argonaut gene (exon sequence, intron sequence, promoter sequences, etc.) which direct
recombination and (2) heterologous transcriptional regulatory sequence(s) which is to be
operably linked to the coding sequence for the genomic Dicer or Argonaut gene upon
recombination of the gene activation construct. For use in generating cultures of Dicer or
Argonaut producing cells, the construct may further include a reporter gene to detect the
presence of the knockout construct in the cell.

The gene activation construct is inserted into a cell, and integrates with the
genomic DNA of the cell in such a position so as to provide the heterologous regulatory
sequences in operative association with the native Dicer or Argonaut gene. Such insertion
occurs by homologous recombination, i.e., recombination regions of the activation
construct that are homologous to the endogenous Dicer or Argonaut gene sequence
hybridize to the genomic DNA and recombine with the genomic sequences so that the
construct is incorporated into the corresponding position of the genomic DNA.

The terms "recombination region" or "targeting sequence" refer to a segment (i.e.,
a portion) of a gene activation construct having a sequence that is substantially identical to
or substantially complementary to a genomic gene sequence, e.g., including 5' flanking
sequences of the genomic gene, and can facilitate homologous recombination between the
genomic sequence and the targeting transgene construct.

As used herein, the term "replacement region" refers to a portion of an activation
construct which becomes integrated into an endogenous chromosomal location following
homologous recombination between a recombination region and a genomic sequence.

The heterologous regulatory sequences, e.g., which are provided in the
replacement region, can include one or more of a variety of elements, including: promoters
(such as constitutive or inducible promoters), enhancers, negative regulatory elements,
locus control regions, transcription factor binding sites, or combinations thereof.

Promoters/enhancers which may be used to control the expression of the targeted
gene in vivo include, but are not limited to, the cytomegalovirus (CMV)
promoter/enhancer (Karasuyama et al. (1989) J Exp Med 169:13), the human β-actin
promoter (Gunning et al. (1987) PNAS 84:4831-4835), the glucocorticoid-inducible
promoter present in the mouse mammary tumor virus long terminal repeat (MMTV LTR)
(Klessig et al. (1984) Mol Cell Biol 4:1354-1362), the long terminal repeat sequences of
Moloney murine leukemia virus (MuLV LTR) (Weiss et al. (1985) RNA Tumor Viruses,
Cold Spring Harbor Laboratory, Cold Spring Harbor, New York), the SV40 early or late
region promoter (Bernoist et al. (1981) Nature 290:304-310; Templeton et al. (1984) Mol
Cell Biol 4:817; and Sprague et al. (1983) J Virol 45:773), the promoter contained in
the 3' long terminal repeat of Rous sarcoma virus (RSV) (Yamamoto et al. (1980) Cell
22:787-797), the herpes simplex virus (HSV) thymidine kinase promoter/enhancer
(Wagner et al. (1981) PNAS 82:3567-71), and the herpes simplex virus LAT promoter
(Wolfe et al. (1992) Nature Genetics 1:379-384).

In still other embodiments, the replacement region merely deletes a negative 
transcriptional control element of the native gene, e.g., to activate expression, or ablates a 
positive control element, e.g., to inhibit expression of the targeted gene.

B. Cell/Organism 

The cell with the target gene may be derived from or contained in any organism 
(e.g., plant, animal, protozoan, virus, bacterium, or fungus). The dsRNA construct may be 
synthesized either in vivo or in vitro. Endogenous RNA polymerase of the cell may
mediate transcription in vivo, or cloned RNA polymerase can be used for transcription in 
vivo or in vitro. For generating double stranded transcripts from a transgene in vivo, a 
regulatory region may be used to transcribe the RNA strand (or strands). 

Furthermore, genetic manipulation becomes possible in organisms that are not
classical genetic models. Breeding and screening programs may be accelerated by the
ability to rapidly assay the consequences of a specific, targeted gene disruption. Gene
disruptions may be used to discover the function of the target gene, to produce disease
models in which the target gene is involved in causing or preventing a pathological
condition, and to produce organisms with improved economic properties.

The cell with the target gene may be derived from or contained in any organism.
The organism may be a plant, animal, protozoan, bacterium, virus, or fungus. The plant may
be a monocot, dicot or gymnosperm; the animal may be a vertebrate or invertebrate.
Preferred microbes are those used in agriculture or by industry, and those that are
pathogenic for plants or animals. Fungi include organisms in both the mold and yeast
morphologies.

Plants include arabidopsis; field crops (e.g., alfalfa, barley, bean, corn, cotton, flax,
pea, rape, rice, rye, safflower, sorghum, soybean, sunflower, tobacco, and wheat);
vegetable crops (e.g., asparagus, beet, broccoli, cabbage, carrot, cauliflower, celery,
cucumber, eggplant, lettuce, onion, pepper, potato, pumpkin, radish, spinach, squash, taro,
tomato, and zucchini); fruit and nut crops (e.g., almond, apple, apricot, banana, blackberry,
blueberry, cacao, cherry, coconut, cranberry, date, feijoa, filbert, grape, grapefruit, guava,
kiwi, lemon, lime, mango, melon, nectarine, orange, papaya, passion fruit, peach, peanut,
pear, pineapple, pistachio, plum, raspberry, strawberry, tangerine, walnut, and
watermelon); and ornamentals (e.g., alder, ash, aspen, azalea, birch, boxwood, camellia,
carnation, chrysanthemum, elm, fir, ivy, jasmine, juniper, oak, palm, poplar, pine,
redwood, rhododendron, rose, and rubber).

Examples of vertebrate animals include fish, mammal, cattle, goat, pig, sheep, 
rodent, hamster, mouse, rat, primate, and human. 

Invertebrate animals include nematodes, other worms, drosophila, and other
insects. Representative genera of nematodes include those that infect animals (e.g.,
Ancylostoma, Ascaridia, Ascaris, Bunostomum, Caenorhabditis, Capillaria, Chabertia,
Cooperia, Dictyocaulus, Haemonchus, Heterakis, Nematodirus, Oesophagostomum,
Ostertagia, Oxyuris, Parascaris, Strongylus, Toxascaris, Trichuris, Trichostrongylus,
Trichonema, Toxocara, Uncinaria) and those that infect plants (e.g., Bursaphelenchus,
Criconemella, Ditylenchus, Globodera, Helicotylenchus, Heterodera,
Longidorus, Meloidogyne, Nacobbus, Paratylenchus, Pratylenchus, Radopholus,
Rotylenchus, Tylenchus, and Xiphinema). Representative orders of insects include
Coleoptera, Diptera, Lepidoptera, and Homoptera.

The cell having the target gene may be from the germ line or somatic, totipotent or
pluripotent, dividing or non-dividing, parenchyma or epithelium, immortalized or
transformed, or the like. The cell may be a stem cell or a differentiated cell. Cell types that
are differentiated include adipocytes, fibroblasts, myocytes, cardiomyocytes, endothelium,
neurons, glia, blood cells, megakaryocytes, lymphocytes, macrophages, neutrophils,
eosinophils, basophils, mast cells, leukocytes, granulocytes, keratinocytes, chondrocytes,
osteoblasts, osteoclasts, hepatocytes, and cells of the endocrine or exocrine glands.

C. Targeted Genes 

The target gene may be a gene derived from the cell, an endogenous gene, a
transgene, or a gene of a pathogen which is present in the cell after infection thereof.
Depending on the particular target gene and the dose of double stranded RNA material
delivered, the procedure may provide partial or complete loss of function for the target
gene. Lower doses of injected material and longer times after administration of dsRNA
may result in inhibition in a smaller fraction of cells. Quantitation of gene expression in a
cell may show similar amounts of inhibition at the level of accumulation of target mRNA
or translation of target protein.

"Inhibition of gene expression" refers to the absence (or observable decrease) in
the level of protein and/or mRNA product from a target gene. "Specificity" refers to the
ability to inhibit the target gene without manifest effects on other genes of the cell. The
consequences of inhibition can be confirmed by examination of the outward properties of
the cell or organism (as presented below in the examples) or by biochemical techniques
such as RNA solution hybridization, nuclease protection, Northern hybridization, reverse
transcription, gene expression monitoring with a microarray, antibody binding, enzyme
linked immunosorbent assay (ELISA), Western blotting, radioimmunoassay (RIA), other
immunoassays, and fluorescence activated cell analysis (FACS). For RNA-mediated
inhibition in a cell line or whole organism, gene expression is conveniently assayed by use
of a reporter or drug resistance gene whose protein product is easily assayed. Such reporter
genes include acetohydroxyacid synthase (AHAS), alkaline phosphatase (AP), beta
galactosidase (LacZ), beta glucuronidase (GUS), chloramphenicol acetyltransferase
(CAT), green fluorescent protein (GFP), horseradish peroxidase (HRP), luciferase (Luc),
nopaline synthase (NOS), octopine synthase (OCS), and derivatives thereof. Multiple
selectable markers are available that confer resistance to ampicillin, bleomycin,
chloramphenicol, gentamycin, hygromycin, kanamycin, lincomycin, methotrexate,
phosphinothricin, puromycin, and tetracyclin.

Depending on the assay, quantitation of the amount of gene expression allows one
to determine a degree of inhibition which is greater than 10%, 33%, 50%, 90%, 95% or
99% as compared to a cell not treated according to the present invention. Lower doses of
injected material and longer times after administration of dsRNA may result in inhibition
in a smaller fraction of cells (e.g., at least 10%, 20%, 50%, 75%, 90%, or 95% of targeted
cells). Quantitation of gene expression in a cell may show similar amounts of inhibition at
the level of accumulation of target mRNA or translation of target protein. As an example,
the efficiency of inhibition may be determined by assessing the amount of gene product in
the cell: mRNA may be detected with a hybridization probe having a nucleotide sequence
outside the region used for the inhibitory double-stranded RNA, or translated polypeptide
may be detected with an antibody raised against the polypeptide sequence of that region.
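
The degree of inhibition described above can be sketched as a simple ratio of treated to untreated expression. This is a minimal illustration, not a prescribed assay: the function names are hypothetical, and the expression levels are assumed to be any background-corrected readout (e.g., reporter signal or mRNA abundance) measured on the same scale for treated and untreated cells.

```python
# Thresholds listed in the text for the degree of inhibition (percent).
INHIBITION_THRESHOLDS = (10, 33, 50, 90, 95, 99)


def percent_inhibition(treated_level, control_level):
    """Percent decrease in expression relative to an untreated control."""
    if control_level <= 0:
        raise ValueError("control_level must be positive")
    return 100.0 * (1.0 - treated_level / control_level)


def thresholds_exceeded(inhibition_pct):
    """Return the listed thresholds that the measured inhibition is greater than."""
    return [t for t in INHIBITION_THRESHOLDS if inhibition_pct > t]


# Hypothetical readout: treated cells show 25 units against 100 in the control.
example = percent_inhibition(treated_level=25.0, control_level=100.0)
example_thresholds = thresholds_exceeded(example)
```

With these hypothetical numbers, the treated sample shows 75% inhibition, which is greater than the 10%, 33% and 50% thresholds but not the 90% threshold.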

As disclosed herein, the present invention is not limited to any type of target
gene or nucleotide sequence. But the following classes of possible target genes are listed
for illustrative purposes: developmental genes (e.g., adhesion molecules, cyclin kinase
inhibitors, Wnt family members, Pax family members, Winged helix family members,
Hox family members, cytokines/lymphokines and their receptors, growth/differentiation
factors and their receptors, neurotransmitters and their receptors); oncogenes (e.g., ABL1,
BCL1, BCL2, BCL6, CBFA2, CBL, CSF1R, ERBA, ERBB, ERBB2, ETS1, ETV6,
FGR, FOS, FYN, HCR, HRAS, JUN, KRAS, LCK, LYN, MDM2, MLL, MYB, MYC,
MYCL1, MYCN, NRAS, PIM1, PML, RET, SRC, TAL1, TCL3, and YES); tumor
suppressor genes (e.g., APC, BRCA1, BRCA2, MADH4, MCC, NF1, NF2, RB1, TP53,
and WT1); and enzymes (e.g., ACC synthases and oxidases, ACP desaturases and
hydroxylases, ADP-glucose pyrophosphorylases, ATPases, alcohol dehydrogenases, amylases,
amyloglucosidases, catalases, cellulases, chalcone synthases, chitinases, cyclooxygenases,
decarboxylases, dextrinases, DNA and RNA polymerases, galactosidases, glucanases,
glucose oxidases, granule-bound starch synthases, GTPases, helicases, hemicellulases,
integrases, inulinases, invertases, isomerases, kinases, lactases, lipases, lipoxygenases,
lysozymes, nopaline synthases, octopine synthases, pectinesterases, peroxidases,
phosphatases, phospholipases, phosphorylases, phytases, plant growth regulator synthases,
polygalacturonases, proteinases and peptidases, pullulanases, recombinases, reverse
transcriptases, RUBISCOs, topoisomerases, and xylanases).

D. dsRNA constructs 

The dsRNA construct may comprise one or more strands of polymerized
ribonucleotide. It may include modifications to either the phosphate-sugar backbone or the
nucleoside. For example, the phosphodiester linkages of natural RNA may be modified to
include at least one of a nitrogen or sulfur heteroatom. Modifications in RNA structure
may be tailored to allow specific genetic inhibition while avoiding a general panic
response in some organisms which is generated by dsRNA. Likewise, bases may be
modified to block the activity of adenosine deaminase. The dsRNA construct may be
produced enzymatically or by partial/total organic synthesis; any modified ribonucleotide
can be introduced by in vitro enzymatic or organic synthesis.

The dsRNA construct may be directly introduced into the cell (i.e., intracellularly);
or introduced extracellularly into a cavity, interstitial space, into the circulation of an
organism, introduced orally, or may be introduced by bathing an organism in a solution
containing RNA. Methods for oral introduction include direct mixing of RNA with food of
the organism, as well as engineered approaches in which a species that is used as food is
engineered to express an RNA, then fed to the organism to be affected. Physical methods
of introducing nucleic acids include injection directly into the cell or extracellular
injection into the organism of an RNA solution.

The double-stranded structure may be formed by a single self-complementary 
RNA strand or two complementary RNA strands. RNA duplex formation may be initiated 
either inside or outside the cell. The RNA may be introduced in an amount which allows 
delivery of at least one copy per cell. Higher doses (e.g., at least 5, 10, 100, 500 or 1000
copies per cell) of double-stranded material may yield more effective inhibition; lower
doses may also be useful for specific applications. Inhibition is sequence-specific in that
nucleotide sequences corresponding to the duplex region of the RNA are targeted for 
genetic inhibition. 
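
To make the "copies per cell" dosing concrete, the following sketch estimates the average number of dsRNA molecules available per cell from the mass delivered. It rests on several assumptions that are not in the text: uniform delivery to every cell, no degradation, and a rough average mass of about 680 g/mol per base pair of dsRNA (a hypothetical default that should be replaced with a sequence-specific value where accuracy matters). The function names are illustrative only.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

# Dose levels mentioned in the text (copies of dsRNA per cell).
DOSE_LEVELS = (1, 5, 10, 100, 500, 1000)


def copies_per_cell(mass_grams, length_bp, cell_count, grams_per_mol_per_bp=680.0):
    """Average dsRNA copies per cell, assuming uniform delivery and no loss.

    grams_per_mol_per_bp is a rough assumed average for dsRNA, not a
    value taken from the source text.
    """
    if length_bp <= 0 or cell_count <= 0:
        raise ValueError("length_bp and cell_count must be positive")
    molar_mass = length_bp * grams_per_mol_per_bp  # g/mol for the whole duplex
    molecules = mass_grams / molar_mass * AVOGADRO
    return molecules / cell_count


def highest_dose_level_met(copies):
    """Largest listed dose level reached by `copies`, or None if below one copy."""
    met = [level for level in DOSE_LEVELS if copies >= level]
    return met[-1] if met else None


# Hypothetical example: 1 pg of a 500 bp duplex spread over 1000 cells.
example_copies = copies_per_cell(mass_grams=1e-12, length_bp=500, cell_count=1000)
```

Under these assumptions, one picogram of a 500 bp duplex distributed over a thousand cells corresponds to well over 1000 copies per cell, which shows how small a mass is needed to reach the higher dose levels listed.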

dsRNA constructs containing a nucleotide sequence identical to a portion of the
target gene are preferred for inhibition. RNA sequences with insertions, deletions, and
single point mutations relative to the target sequence have also been found to be effective
for inhibition. Thus, sequence identity may be optimized by sequence comparison and
alignment algorithms known in the art (see Gribskov and Devereux, Sequence Analysis
Primer, Stockton Press, 1991, and references cited therein) and calculating the percent
difference between the nucleotide sequences by, for example, the Smith-Waterman
algorithm as implemented in the BESTFIT software program using default parameters
(e.g., University of Wisconsin Genetic Computing Group). Greater than 90% sequence
identity, or even 100% sequence identity, between the inhibitory RNA and the portion of
the target gene is preferred. Alternatively, the duplex region of the RNA may be defined
functionally as a nucleotide sequence that is capable of hybridizing with a portion of the
target gene transcript (e.g., 400 mM NaCl, 40 mM PIPES pH 6.4, 1 mM EDTA, 50°C or
70°C hybridization for 12-16 hours; followed by washing). The length of the identical
nucleotide sequences may be, for example, at least 25, 50, 100, 200, 300 or 400 bases. In
certain embodiments, the dsRNA construct is 400-800 bases in length.
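
The percent-identity comparison described above can be illustrated with a bare-bones Smith-Waterman local alignment. This is a sketch only and is not the BESTFIT implementation: it uses a linear gap penalty, and the scoring values (match, mismatch, gap) are hypothetical choices rather than BESTFIT's default parameters. Function names are illustrative.

```python
def smith_waterman_identity(a, b, match=1, mismatch=-1, gap=-2):
    """Return (identical_positions, alignment_length) for one best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    # Fill the scoring matrix, remembering the highest-scoring cell.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)
    # Trace back from the best cell until the local alignment ends (score 0).
    i, j = best_pos
    identical = length = 0
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            identical += int(a[i - 1] == b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1  # gap in b
        else:
            j -= 1  # gap in a
        length += 1
    return identical, length


def percent_identity(a, b):
    """Percent identity over the aligned region (0.0 if nothing aligns)."""
    identical, length = smith_waterman_identity(a, b)
    return 100.0 * identical / length if length else 0.0


# Hypothetical 10-base sequences differing at a single position.
example_identity = percent_identity("GAUUACAGGC", "GAUCACAGGC")
```

Note that identity here is computed over the locally aligned region only, so a short, perfectly matching stretch can report 100% identity; in practice the aligned length should be checked against the length criteria given in the text.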

100% sequence identity between the RNA and the target gene is not required to
practice the present invention. Thus the invention has the advantage of being able to
tolerate sequence variations that might be expected due to genetic mutation, strain
polymorphism, or evolutionary divergence.

The dsRNA construct may be synthesized either in vivo or in vitro. Endogenous
RNA polymerase of the cell may mediate transcription in vivo, or cloned RNA
polymerase can be used for transcription in vivo or in vitro. For transcription from a
transgene in vivo or an expression construct, a regulatory region (e.g., promoter, enhancer,
silencer, splice donor and acceptor, polyadenylation) may be used to transcribe the dsRNA
strand (or strands). Inhibition may be targeted by specific transcription in an organ, tissue,
or cell type; stimulation of an environmental condition (e.g., infection, stress, temperature,
chemical inducers); and/or engineering transcription at a developmental stage or age. The
RNA strands may or may not be polyadenylated; the RNA strands may or may not be
capable of being translated into a polypeptide by a cell's translational apparatus. The
dsRNA construct may be chemically or enzymatically synthesized by manual or
automated reactions. The dsRNA construct may be synthesized by a cellular RNA
polymerase or a bacteriophage RNA polymerase (e.g., T3, T7, SP6). The use and
production of an expression construct are known in the art (32, 33, 34; see also WO
97/32016; U.S. Pat. Nos. 5,593,874, 5,698,425, 5,712,135, 5,789,214, and 5,804,693; and
the references cited therein). If synthesized chemically or by in vitro enzymatic synthesis,
the RNA may be purified prior to introduction into the cell. For example, RNA can be
purified from a mixture by extraction with a solvent or resin, precipitation,
electrophoresis, chromatography or a combination thereof. Alternatively, the dsRNA
construct may be used with no or a minimum of purification to avoid losses due to sample
processing. The dsRNA construct may be dried for storage or dissolved in an aqueous
solution. The solution may contain buffers or salts to promote annealing, and/or
stabilization of the duplex strands.

Physical methods of introducing nucleic acids include injection of a solution
containing the dsRNA construct, bombardment by particles covered by the dsRNA
construct, soaking the cell or organism in a solution of the RNA, or electroporation of cell
membranes in the presence of the dsRNA construct. A viral construct packaged into a viral
particle would accomplish both efficient introduction of an expression construct into the
cell and transcription of dsRNA construct encoded by the expression construct. Other
methods known in the art for introducing nucleic acids to cells may be used, such as
lipid-mediated carrier transport, chemical-mediated transport, such as calcium phosphate, and
the like. Thus the dsRNA construct may be introduced along with components that
perform one or more of the following activities: enhance RNA uptake by the cell, promote
annealing of the duplex strands, stabilize the annealed strands, or otherwise increase
inhibition of the target gene.

E. Illustrative Uses 

One utility of the present invention is as a method of identifying gene function in
an organism, especially higher eukaryotes, comprising the use of double-stranded RNA to
inhibit the activity of a target gene of previously unknown function. Instead of the time
consuming and laborious isolation of mutants by traditional genetic screening, functional
genomics would envision determining the function of uncharacterized genes by employing
the invention to reduce the amount and/or alter the timing of target gene activity. The
invention could be used in determining potential targets for pharmaceutics, understanding
normal and pathological events associated with development, determining signaling
pathways responsible for postnatal development/aging, and the like. The increasing speed
of acquiring nucleotide sequence information from genomic and expressed gene sources,
including total sequences for mammalian genomes, can be coupled with the invention to
determine gene function in a cell or in a whole organism. The preference of different
organisms to use particular codons, searching sequence databases for related gene
products, correlating the linkage map of genetic traits with the physical map from which
the nucleotide sequences are derived, and artificial intelligence methods may be used to
define putative open reading frames from the nucleotide sequences acquired in such
sequencing projects.

A simple assay would be to inhibit gene expression according to the partial
sequence available from an expressed sequence tag (EST). Functional alterations in
growth, development, metabolism, disease resistance, or other biological processes would
be indicative of the normal role of the EST's gene product.

The ease with which the dsRNA construct can be introduced into an intact
cell/organism containing the target gene allows the present invention to be used in high
throughput screening (HTS). For example, duplex RNA can be produced by an
amplification reaction using primers flanking the inserts of any gene library derived from
the target cell/organism. Inserts may be derived from genomic DNA or mRNA (e.g.,
cDNA and cRNA). Individual clones from the library can be replicated and then isolated
in separate reactions, but preferably the library is maintained in individual reaction vessels
(e.g., a 96 well microtiter plate) to minimize the number of steps required to practice the
invention and to allow automation of the process. Solutions containing duplex RNAs that
are capable of inhibiting the different expressed genes can be placed into individual wells
positioned on a microtiter plate as an ordered array, and intact cells/organisms in each well
can be assayed for any changes or modifications in behavior or development due to
inhibition of target gene activity. The amplified RNA can be fed directly to, or injected into,
the cell/organism containing the target gene. Alternatively, the duplex RNA can be
produced by in vivo or in vitro transcription from an expression construct used to produce
the library. The construct can be replicated as individual clones of the library and
transcribed to produce the RNA; each clone can then be fed to, or injected into, the
cell/organism containing the target gene. The function of the target gene can be assayed
from the effects it has on the cell/organism when gene activity is inhibited. This screening
could be amenable to small subjects that can be processed in large number, for example,
tissue culture cells derived from mammals, especially primates, and most preferably
humans.
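
The ordered array described above can be illustrated with a small bookkeeping sketch. It is not part of the disclosed method: the function name plate_layout, the clone identifiers, and the row-major well naming (A1 to H12 for a 96 well plate) are assumptions made only for this example.

```python
from string import ascii_uppercase


def plate_layout(clone_ids, rows=8, cols=12):
    """Assign library clones to wells of a microtiter plate in row-major order.

    Hypothetical helper: the 8 x 12 default corresponds to a 96 well plate,
    and wells are named by row letter and column number (A1 ... H12).
    """
    capacity = rows * cols
    if len(clone_ids) > capacity:
        raise ValueError(f"{len(clone_ids)} clones do not fit in {capacity} wells")
    layout = {}
    for index, clone in enumerate(clone_ids):
        row, col = divmod(index, cols)  # row-major: fill A1..A12, then B1..B12, ...
        layout[f"{ascii_uppercase[row]}{col + 1}"] = clone
    return layout


if __name__ == "__main__":
    # Illustrative clone names only; a real library would supply its own identifiers.
    layout = plate_layout([f"clone_{i}" for i in range(96)])
    print(layout["A1"], layout["H12"])
```

Keeping the mapping from well to clone explicit is what lets a phenotype observed in a given well be traced back to the dsRNA, and hence the gene, that produced it.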

If a characteristic of an organism is determined to be genetically linked to a
polymorphism through RFLP or QTL analysis, the present invention can be used to gain
insight regarding whether that genetic polymorphism might be directly responsible for the
characteristic. For example, a fragment defining the genetic polymorphism or sequences in
the vicinity of such a genetic polymorphism can be amplified to produce an RNA, the
duplex RNA can be introduced to the organism or cell, and whether an alteration in the
characteristic is correlated with inhibition can be determined. Of course, there may be
trivial explanations for negative results with this type of assay, for example: inhibition of
the target gene causes lethality, inhibition of the target gene may not result in any
observable alteration, the fragment contains nucleotide sequences that are not capable of
inhibiting the target gene, or the target gene's activity is redundant.

The present invention may be useful in allowing the inhibition of essential genes.
Such genes may be required for cell or organism viability at only particular stages of
development or cellular compartments. The functional equivalent of conditional mutations
may be produced by inhibiting activity of the target gene when or where it is not required
for viability. The invention allows addition of RNA at specific times of development and
locations in the organism without introducing permanent mutations into the target genome.

If alternative splicing produced a family of transcripts that were distinguished by
usage of characteristic exons, the present invention can target inhibition through the
appropriate exons to specifically inhibit or to distinguish among the functions of family
members. For example, a hormone that contained an alternatively spliced transmembrane
domain may be expressed in both membrane bound and secreted forms. Instead of
isolating a nonsense mutation that terminates translation before the transmembrane
domain, the functional consequences of having only secreted hormone can be determined
according to the invention by targeting the exon containing the transmembrane domain
and thereby inhibiting expression of membrane-bound hormone.

The present invention may be used alone or as a component of a kit having at least
one of the reagents necessary to carry out the in vitro or in vivo introduction of RNA to
test samples or subjects. Preferred components are the dsRNA and a vehicle that promotes
introduction of the dsRNA. Such a kit may also include instructions to allow a user of the
kit to practice the invention.

Alternatively, an organism may be engineered to produce dsRNA which produces
commercially or medically beneficial results, for example, resistance to a pathogen or its
pathogenic effects, improved growth, or novel developmental patterns.

IV. Exemplification

The invention, now being generally described, will be more readily understood by
reference to the following examples, which are included merely for purposes of illustration
of certain aspects and embodiments of the present invention and are not intended to limit
the invention.

Example 1: An RNA-directed nuclease mediates RNAi gene silencing

In a diverse group of organisms that includes Caenorhabditis elegans, Drosophila,
planaria, hydra, trypanosomes, fungi and plants, the introduction of double-stranded RNAs
inhibits gene expression in a sequence-specific manner. These responses, called RNA
interference or post-transcriptional gene silencing, may provide anti-viral defence,
modulate transposition or regulate gene expression. We have taken a biochemical
approach towards elucidating the mechanisms underlying this genetic phenomenon. Here
we show that 'loss-of-function' phenotypes can be created in cultured Drosophila cells by
transfection with specific double-stranded RNAs. This coincides with a marked reduction
in the level of cognate cellular messenger RNAs. Extracts of transfected cells contain a
nuclease activity that specifically degrades exogenous transcripts homologous to
transfected double-stranded RNA. This enzyme contains an essential RNA component.
After partial purification, the sequence-specific nuclease co-fractionates with a discrete,
~25-nucleotide RNA species which may confer specificity to the enzyme through
homology to the substrate mRNAs.

Although double-stranded RNAs (dsRNAs) can provoke gene silencing in
numerous biological contexts including Drosophila, the mechanisms underlying this
phenomenon have remained mostly unknown. We therefore wanted to establish a
biochemically tractable model in which such mechanisms could be investigated.

Transient transfection of cultured Drosophila S2 cells with a lacZ expression
vector resulted in β-galactosidase activity that was easily detectable by an in situ assay
(Fig. 1a). This activity was greatly reduced by co-transfection with a dsRNA
corresponding to the first 300 nucleotides of the lacZ sequence, whereas co-transfection
with a control dsRNA (CD8) (Fig. 1a) or with single-stranded RNAs of either sense or
antisense orientation (data not shown) had little or no effect. This indicated that dsRNAs
could interfere, in a sequence-specific fashion, with gene expression in cultured cells.

To determine whether RNA interference (RNAi) could be used to target
endogenous genes, we transfected S2 cells with a dsRNA corresponding to the first 540
nucleotides of Drosophila cyclin E, a gene that is essential for progression into S phase of
the cell cycle. During log-phase growth, untreated S2 cells reside primarily in G2/M (Fig.
1b). Transfection with lacZ dsRNA had no effect on cell-cycle distribution, but
transfection with the cyclin E dsRNA caused a G1-phase cell-cycle arrest (Fig. 1b). The
ability of cyclin E dsRNA to provoke this response was length-dependent.
Double-stranded RNAs of 540 and 400 nucleotides were quite effective, whereas dsRNAs of 200
and 300 nucleotides were less potent. Double-stranded cyclin E RNAs of 50 or 100
nucleotides were inert in our assay, and transfection with a single-stranded, antisense
cyclin E RNA had virtually no effect.

One hallmark of RNAi is a reduction in the level of mRNAs that are homologous
to the dsRNA. Cells transfected with the cyclin E dsRNA (bulk population) showed
diminished endogenous cyclin E mRNA as compared with control cells (Fig. 1c).
Similarly, transfection of cells with dsRNAs homologous to fizzy, a component of the
anaphase-promoting complex (APC), or cyclin A, a cyclin that acts in S, G2 and M, also
caused reduction of their cognate mRNAs (Fig. 1c). The modest reduction in fizzy mRNA
levels in cells transfected with cyclin A dsRNA probably resulted from arrest at a point in
the division cycle at which fizzy transcription is low. These results indicate that RNAi
may be a generally applicable method for probing gene function in cultured Drosophila
cells.

The decrease in mRNA levels observed upon transfection of specific dsRNAs into
Drosophila cells could be explained by effects at transcriptional or post-transcriptional
levels. Data from other systems have indicated that some elements of the dsRNA response
may affect mRNA directly (reviewed in refs 1 and 6). We therefore sought to develop a
cell-free assay that reflected, at least in part, RNAi.

S2 cells were transfected with dsRNAs corresponding to either cyclin E or lacZ.
Cellular extracts were incubated with synthetic mRNAs of lacZ or cyclin E. Extracts
prepared from cells transfected with the 540-nucleotide cyclin E dsRNA efficiently
degraded the cyclin E transcript; however, the lacZ transcript was stable in these lysates
(Fig. 2a). Conversely, lysates from cells transfected with the lacZ dsRNA degraded the
lacZ transcript but left the cyclin E mRNA intact. These results indicate that RNAi ablates
target mRNAs through the generation of a sequence-specific nuclease activity. We have
termed this enzyme RISC (RNA-induced silencing complex). Although we occasionally
observed possible intermediates in the degradation process (see Fig. 1), the absence of
stable cleavage end-products indicates an exonuclease (perhaps coupled to an
endonuclease). However, it is possible that the RNAi nuclease makes an initial
endonucleolytic cut and that non-specific exonucleases in the extract complete the
degradation process. In addition, our ability to create an extract that targets lacZ in vitro
indicates that the presence of an endogenous gene is not required for the RNAi response.

To examine the substrate requirements for the dsRNA-induced, sequence-specific
nuclease activity, we incubated a variety of cyclin E-derived transcripts with an extract
derived from cells that had been transfected with the 540-nucleotide cyclin E dsRNA (Fig.
2b, c). Just as a length requirement was observed for the transfected dsRNA, the RNAi
nuclease activity showed a dependence on the size of the RNA substrate. Both a
600-nucleotide transcript that extends slightly beyond the targeted region (Fig. 2b) and an
~1-kilobase (kb) transcript that contains the entire coding sequence (data not shown) were
completely destroyed by the extract. Surprisingly, shorter substrates were not degraded as
efficiently. Reduced activity was observed against either a 300- or a 220-nucleotide
transcript, and a 100-nucleotide transcript was resistant to nuclease in our assay. This was
not due solely to position effects because ~100-nucleotide transcripts derived from other
portions of the transfected dsRNA behaved similarly (data not shown). As expected, the
nuclease activity (or activities) present in the extract could also recognize the antisense
strand of the cyclin E mRNA. Again, substrates that contained a substantial portion of the
targeted region were degraded efficiently whereas those that contained a shorter stretch of
homologous sequence (~130 nucleotides) were recognized inefficiently (Fig. 2c, as600).
For both the sense and antisense strands, transcripts that had no homology with the
transfected dsRNA (Fig. 2b, Eout; Fig. 2c, as300) were not degraded. Although we cannot
exclude the possibility that nuclease specificity could have migrated beyond the targeted
region, the resistance of transcripts that do not contain homology to the dsRNA is
consistent with data from C. elegans. Double-stranded RNAs homologous to an upstream
cistron have little or no effect on a linked downstream cistron, despite the fact that
unprocessed, polycistronic mRNAs can be readily detected. Furthermore, the nuclease
was inactive against a dsRNA identical to that used to provoke the RNAi response in vivo
(Fig. 2b). In the in vitro system, neither a 5' cap nor a poly(A) tail was required, as such
transcripts were degraded as efficiently as uncapped and non-polyadenylated RNAs.

Gene silencing provoked by dsRNA is sequence specific. A plausible mechanism
for determining specificity would be incorporation of nucleic-acid guide sequences into
the complexes that accomplish silencing. In accord with this idea, pre-treatment of
extracts with a Ca2+-dependent nuclease (micrococcal nuclease) abolished the ability of
these extracts to degrade cognate mRNAs (Fig. 3). Activity could not be rescued by
addition of non-specific RNAs such as yeast transfer RNA. Although micrococcal
nuclease can degrade both DNA and RNA, treatment of the extract with DNAse I had no
effect (Fig. 3). Sequence-specific nuclease activity, however, did require protein (data not
shown). Together, our results support the possibility that the RNAi nuclease is a
ribonucleoprotein, requiring both RNA and protein components. Biochemical
fractionation (see below) is consistent with these components being associated in extract
rather than being assembled on the target mRNA after its addition.

In plants, the phenomenon of co-suppression has been associated with the
existence of small (~25-nucleotide) RNAs that correspond to the gene that is being
silenced. To address the possibility that a similar RNA might exist in Drosophila and
guide the sequence-specific nuclease in the choice of substrate, we partially purified our
activity through several fractionation steps. Crude extracts contained both
sequence-specific nuclease activity and abundant, heterogeneous RNAs homologous to the
transfected dsRNA (Figs 2 and 4a). The RNAi nuclease fractionated with ribosomes in a
high-speed centrifugation step. Activity could be extracted by treatment with high salt, and
ribosomes could be removed by an additional centrifugation step. Chromatography of
soluble nuclease over an anion-exchange column resulted in a discrete peak of activity
(Fig. 4b, cyclin E). This retained specificity as it was inactive against a heterologous
mRNA (Fig. 4b, lacZ). Active fractions also contained an RNA species of 25 nucleotides
that is homologous to the cyclin E target (Fig. 4b, northern). The band observed on
northern blots may represent a family of discrete RNAs because it could be detected with
probes specific for both the sense and antisense cyclin E sequences and with probes
derived from distinct segments of the dsRNA (data not shown). At present, we cannot
determine whether the 25-nucleotide RNA is present in the nuclease complex in a
double-stranded or single-stranded form.

RNA interference allows an adaptive defence against both exogenous and
endogenous dsRNAs, providing something akin to a dsRNA immune response. Our data,
and that of others, are consistent with a model in which dsRNAs present in a cell are
converted, either through processing or replication, into small specificity determinants of
discrete size in a manner analogous to antigen processing. Our results suggest that the
post-transcriptional component of dsRNA-dependent gene silencing is accomplished by a
sequence-specific nuclease that incorporates these small RNAs as guides that target
specific messages based upon sequence recognition. The identical size of putative
specificity determinants in plants and animals predicts a conservation of both the
mechanisms and the components of dsRNA-induced, post-transcriptional gene silencing in
diverse organisms. In plants, dsRNAs provoke not only post-transcriptional gene silencing
but also chromatin remodelling and transcriptional repression. It is now critical to
determine whether conservation of gene-silencing mechanisms also exists at the
transcriptional level and whether chromatin remodelling can be directed in a
sequence-specific fashion by these same dsRNA-derived guide sequences.

Methods 

Cell culture and RNA methods. S2 (ref 22) cells were cultured at 27 °C in 90%
Schneider's insect media (Sigma), 10% heat inactivated fetal bovine serum (FBS). Cells
were transfected with dsRNA and plasmid DNA by calcium phosphate co-precipitation.
Identical results were observed when cells were transfected using lipid reagents (for
example, Superfect, Qiagen). For FACS analysis, cells were additionally transfected with
a vector that directs expression of a green fluorescent protein (GFP)-US9 fusion protein.
These cells were fixed in 90% ice-cold ethanol and stained with propidium iodide at 25 µg
ml⁻¹. FACS was performed on an Elite flow cytometer (Coulter). For northern blotting,
equal loading was ensured by over-probing blots with a control complementary DNA
(RP49). For the production of dsRNA, transcription templates were generated by
polymerase chain reaction such that they contained T7 promoter sequences on each end of
the template. RNA was prepared using the RiboMax kit (Promega). Confirmation that
RNAs were double stranded came from their complete sensitivity to RNAse III (a gift
from A. Nicholson). Target mRNA transcripts were synthesized using the Riboprobe kit
(Promega) and were gel purified before use.
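
As an illustration of how a transcription template can carry a T7 promoter on each end, the sketch below prepends a promoter sequence to both PCR primers. It rests on assumptions not stated in the source: the promoter string is the commonly used minimal T7 promoter followed by GGG, the 20-nucleotide priming length is arbitrary, and the function names are hypothetical. The actual primers used in this work are not given.

```python
# Assumed sequence: widely used minimal T7 promoter plus GGG; not taken from the source.
T7_PROMOTER = "TAATACGACTCACTATAGGG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def t7_primers(target, primer_len=20):
    """Design a primer pair that places a T7 promoter at each end of the amplicon.

    Transcription from both promoters yields the sense and antisense strands,
    which can anneal to form dsRNA. `primer_len` is an illustrative choice.
    """
    target = target.upper()
    if len(target) < 2 * primer_len:
        raise ValueError("target is too short for two non-overlapping primers")
    forward = T7_PROMOTER + target[:primer_len]
    reverse = T7_PROMOTER + reverse_complement(target[-primer_len:])
    return forward, reverse


if __name__ == "__main__":
    fwd, rev = t7_primers("ATGC" * 15)  # toy 60-nucleotide target, not a real gene
    print(fwd)
    print(rev)
```

In practice primer design would also consider melting temperature and specificity, which this sketch deliberately leaves out.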

Extract preparation. Log-phase S2 cells were plated on 15-cm tissue culture dishes and
transfected with 30 µg dsRNA and 30 µg carrier plasmid DNA. Seventy-two hours after
transfection, cells were harvested in PBS containing 5 mM EGTA, washed twice in PBS
and once in hypotonic buffer (10 mM HEPES pH 7.3, 6 mM β-mercaptoethanol). Cells
were suspended in 0.7 packed-cell volumes of hypotonic buffer containing Complete
protease inhibitors (Boehringer) and 0.5 units ml⁻¹ of RNasin (Promega). Cells were
disrupted in a dounce homogenizer with a type B pestle, and lysates were centrifuged at
SOjOOOg for 20 min. Supernatants were used in an in vitro assay containing 20 mM HEPES
pH 7.3, 110 mM KOAc, 1 mM Mg(OAc)2, 3 mM EGTA, 2 mM CaCl2, 1 mM DTT.
Typically, 5 µl extract was used in a 10 µl assay that also contained 10,000 c.p.m.
synthetic mRNA substrate.
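
The final concentrations listed for the in vitro assay can be converted into pipetting volumes with the usual dilution relation C1 x V1 = C2 x V2. The sketch below is only an illustration: the stock concentrations in STOCK_MM are hypothetical values chosen for the example (the source does not state them), and the calculation ignores whatever the extract itself and the labelled substrate contribute to the final buffer.

```python
# Final assay concentrations (mM) as listed in the text.
FINAL_MM = {
    "HEPES pH 7.3": 20,
    "KOAc": 110,
    "Mg(OAc)2": 1,
    "EGTA": 3,
    "CaCl2": 2,
    "DTT": 1,
}

# Hypothetical stock concentrations (mM); assumed for illustration only.
STOCK_MM = {
    "HEPES pH 7.3": 1000,
    "KOAc": 2000,
    "Mg(OAc)2": 100,
    "EGTA": 100,
    "CaCl2": 100,
    "DTT": 100,
}


def component_volumes(final_volume_ul=10.0):
    """Volume (µl) of each stock needed: V1 = C2 * V2 / C1."""
    return {
        name: FINAL_MM[name] * final_volume_ul / STOCK_MM[name]
        for name in FINAL_MM
    }


def water_volume(final_volume_ul=10.0, extract_ul=5.0):
    """Water (µl) needed to reach the final volume after stocks and extract."""
    used = sum(component_volumes(final_volume_ul).values()) + extract_ul
    if used > final_volume_ul:
        raise ValueError("stocks are too dilute for this reaction volume")
    return final_volume_ul - used


if __name__ == "__main__":
    for name, volume in component_volumes().items():
        print(f"{name}: {volume:.2f} µl")
    print(f"water: {water_volume():.2f} µl")
```

With small reaction volumes such as these, the components would normally be combined as a master mix for many reactions rather than pipetted individually.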

Extract fractionation. Extracts were centrifuged at 200,000g for 3 h and the resulting
pellet (containing ribosomes) was extracted in hypotonic buffer also containing 1 mM
MgCl2 and 300 mM KOAc. The extracted material was spun at 100,000g for 1 h and the
resulting supernatant was fractionated on a Source 15Q column (Pharmacia) using a KCl
gradient in buffer A (20 mM HEPES pH 7.0, 1 mM dithiothreitol, 1 mM MgCl2).
Fractions were assayed for nuclease activity as described above. For northern blotting,
fractions were proteinase K/SDS treated, phenol extracted, and resolved on 15%
acrylamide 8 M urea gels. RNA was electroblotted onto Hybond N+ and probed with
strand-specific riboprobes derived from cyclin E mRNA. Hybridization was carried out in
500 mM NaPO4 pH 7.0, 15% formamide, 7% SDS, 1% BSA. Blots were washed in 1x
SSC at 37-45 °C.

1. Sharp, P. A. RNAi and double-strand RNA. Genes Dev. 13, 139-141 (1999).



WO 01/68836 PCT/US01/08435



2. Sanchez-Alvarado, A. & Newmark, P. A. Double-stranded RNA specifically disrupts
gene expression during planarian regeneration. Proc. Natl Acad. Sci. USA 96, 5049-5054
(1999).

3. Lohmann, J. U., Endl, I. & Bosch, T. C. Silencing of developmental genes in Hydra.
Dev. Biol. 214, 211-214 (1999).

4. Cogoni, C. & Macino, G. Gene silencing in Neurospora crassa requires a protein
homologous to RNA-dependent RNA polymerase. Nature 399, 166-169 (1999).

5. Waterhouse, P. M., Graham, M. W. & Wang, M. B. Virus resistance and gene
silencing in plants can be induced by simultaneous expression of sense and antisense
RNA. Proc. Natl Acad. Sci. USA 95, 13959-13964 (1998).

6. Montgomery, M. K. & Fire, A. Double-stranded RNA as a mediator in
sequence-specific genetic silencing and co-suppression. Trends Genet. 14, 225-228 (1998).

7. Ngo, H., Tschudi, C., Gull, K. & Ullu, E. Double-stranded RNA induces mRNA
degradation in Trypanosoma brucei. Proc. Natl Acad. Sci. USA 95, 14687-14692 (1998).

8. Tabara, H. et al. The rde-1 gene, RNA interference, and transposon silencing in C.
elegans. Cell 99, 123-132 (1999).

9. Ketting, R. F., Haverkamp, T. H. A., van Luenen, H. G. A. M. & Plasterk, R. H. A.
mut-7 of C. elegans, required for transposon silencing and RNA interference, is a homolog
of Werner Syndrome helicase and RNaseD. Cell 99, 133-141 (1999).

10. Ratcliff, F., Harrison, B. D. & Baulcombe, D. C. A similarity between viral defense
and gene silencing in plants. Science 276, 1558-1560 (1997).

11. Kennerdell, J. R. & Carthew, R. W. Use of dsRNA-mediated genetic interference to
demonstrate that frizzled and frizzled 2 act in the wingless pathway. Cell 95, 1017-1026
(1998).

12. Misquitta, L. & Paterson, B. M. Targeted disruption of gene function in Drosophila by
RNA interference; a role for nautilus in embryonic somatic muscle formation. Proc. Natl
Acad. Sci. USA 96, 1451-1456 (1999).

13. Kalejta, R. F., Brideau, A. D., Banfield, B. W. & Beavis, A. J. An integral membrane
green fluorescent protein marker, Us9-GFP, is quantitatively retained in cells during
propidium iodide-based cell cycle analysis by flow cytometry. Exp. Cell Res. 248,
322-328 (1999).

14. Wolf, D. A. & Jackson, P. K. Cell cycle: oiling the gears of anaphase. Curr. Biol. 8,
R637-R639 (1998).






15. Kramer, E. R., Gieffers, C., Holz, G., Hengstschlager, M. & Peters, J. M. Activation of
the human anaphase-promoting complex by proteins of the CDC20/fizzy family. Curr.
Biol. 8, 1207-1210 (1998).

16. Shuttleworth, J. & Colman, A. Antisense oligonucleotide-directed cleavage of mRNA
in Xenopus oocytes and eggs. EMBO J. 7, 427-434 (1988).

17. Tabara, H., Grishok, A. & Mello, C. C. RNAi in C. elegans: soaking in the genome
sequence. Science 282, 430-432 (1998).

18. Bosher, J. M., Dufourcq, P., Sookhareea, S. & Labouesse, M. RNA interference can
target pre-mRNA. Consequences for gene expression in a Caenorhabditis elegans operon.
Genetics 153, 1245-1256 (1999).

19. Hamilton, J. A. & Baulcombe, D. C. A species of small antisense RNA in
posttranscriptional gene silencing in plants. Science 286, 950-952 (1999).

20. Jones, L. A., Thomas, C. L. & Maule, A. J. De novo methylation and co-suppression
induced by a cytoplasmically replicating plant RNA virus. EMBO J. 17, 6385-6393
(1998).

21. Jones, L. A. et al. RNA-DNA interactions and DNA methylation in
post-transcriptional gene silencing. Plant Cell 11, 2291-2301 (1999).

22. Schneider, I. Cell lines derived from late embryonic stages of Drosophila
melanogaster. J. Embryol. Exp. Morphol. 27, 353-365 (1972).

23. Di Nocera, P. P. & Dawid, I. B. Transient expression of genes introduced into cultured
cells of Drosophila. Proc. Natl Acad. Sci. USA 80, 7095-7098 (1983).

Example 2: Role for a bidentate ribonuclease in the initiation step of RNA 
interference 

Genetic approaches in worms, fungi and plants have identified a group of proteins
that are essential for double-stranded RNA-induced gene silencing. Among these are
ARGONAUTE family members (e.g. RDE1, QDE2), recQ-family helicases (MUT-7,
QDE3), and RNA-dependent RNA polymerases (e.g. EGO-1, QDE1, SGS2/SDE1).
While potential roles have been proposed, none of these genes has been assigned a
definitive function in the silencing process. Biochemical studies have suggested that
PTGS is accomplished by a multicomponent nuclease that targets mRNAs for
degradation. We have shown that the specificity of this complex may derive from the
incorporation of a small guide sequence that is homologous to the mRNA substrate.
Originally identified in plants that were actively silencing transgenes, these ~22 nt RNAs
have been produced during RNAi in vitro using an extract prepared from Drosophila
embryos. Putative guide RNAs can also be produced in extracts from Drosophila S2 cells
(Fig. 5a). With the goal of understanding the mechanism of post-transcriptional gene
silencing, we have undertaken both biochemical fractionation and candidate gene
approaches to identify the enzymes that execute each step of RNAi.

Our previous studies resulted in the partial purification of a nuclease, RISC, that is
an effector of RNA interference. See Example 1. This enzyme was isolated from
Drosophila S2 cells in which RNAi had been initiated in vivo by transfection with dsRNA.
We first sought to determine whether the RISC enzyme and the enzyme that initiates
RNAi via processing of dsRNA into 22mers are distinct activities. RISC activity could be
largely cleared from extracts by high-speed centrifugation (100,000 x g for 60 min) while
the activity that produces 22mers remained in the supernatant (Fig. 5b,c). This simple
fractionation indicated that RISC and the 22mer-generating activity are separable and thus
distinct enzymes. However, it seems likely that they might interact at some point during
the silencing process.

RNAse III family members are among the few nucleases that show specificity for
double-stranded RNA. Analysis of the Drosophila and C. elegans genomes reveals
several types of RNAse III enzymes. First is the canonical RNAse III, which contains a
single RNAse III signature motif and a double-stranded RNA binding domain (dsRBD;
e.g. RNC_CAEEL). Second is a class represented by Drosha, a Drosophila enzyme that
contains two RNAse III motifs and a dsRBD (CeDrosha in C. elegans). A third class
contains two RNAse III signatures and an amino-terminal helicase domain (e.g.
Drosophila CG4792, CG6493, C. elegans K12H4.8), and these had previously been
proposed by Bass as candidate RNAi nucleases. Representatives of all three classes
were tested for the ability to produce discrete, ~22 nt RNAs from dsRNA substrates.

Partial digestion of a 500 nt cyclin E dsRNA with purified, bacterial RNAse III
produced a smear of products while nearly complete digestion produced a heterogeneous
group of ~11-17 nucleotide RNAs (not shown). In order to test the dual-RNAse III
enzymes, we prepared T7 epitope-tagged versions of Drosha and CG4792. These were
expressed in transfected S2 cells and isolated by immunoprecipitation using
antibody-agarose conjugates. Treatment of the dsRNA with the CG4792 immunoprecipitate yielded
~22 nt fragments similar to those produced in either S2 or embryo extracts (Fig. 6a).
Neither activity in extract nor activity in immunoprecipitates depended on the sequence of
the RNA substrate since dsRNAs derived from several genes were processed equivalently
(see Supplement 1). Negative results were obtained with Drosha and with
immunoprecipitates of a DExH box helicase (Homeless; see Fig. 6a,b). Western blotting
confirmed that each of the tagged proteins was expressed and immunoprecipitated
similarly (see Supplement 2). Thus, we conclude that CG4792 may carry out the initiation
step of RNA interference by producing ~22 nt guide sequences from dsRNAs. Because
of its ability to digest dsRNA into uniformly sized, small RNAs, we have named this
enzyme Dicer (Dcr). Dicer mRNA is expressed in embryos, in S2 cells, and in adult flies,
consistent with the presence of functional RNAi machinery in all of these contexts (see
Supplement 3).

The possibility that Dicer might be the nuclease responsible for the production of
guide RNAs from dsRNAs prompted us to raise an antiserum directed against the
carboxy-terminus of the Dicer protein (Dicer-1, CG4792). This antiserum could
immunoprecipitate a nuclease activity from either Drosophila embryo extracts or from S2
cell lysates that produced ~22 nt RNAs from dsRNA substrates (Fig. 6C). The putative
guide RNAs that are produced by the Dicer-1 enzyme precisely comigrate with 22mers
that are produced in extract and with 22mers that are associated with the RISC enzyme
(Fig. 6D,F). It had previously been shown that the enzyme that produced guide RNAs in
Drosophila embryo extracts was ATP-dependent. Depletion of this cofactor resulted in
an ~6-fold lower rate of dsRNA cleavage and in the production of RNAs with a slightly
lower mobility. Of interest was the fact that both Dicer-1 immunoprecipitates and extracts
from S2 cells require ATP for the production of ~22mers (Fig. 6D). We do not observe
the accumulation of lower mobility products in these cases, although we do routinely
observe these in ATP-depleted embryo extracts. The requirement of this nuclease for ATP
is a quite unusual property. We hypothesize that this requirement could indicate that the
enzyme may act processively on the dsRNA, with the helicase domain harnessing the
energy of ATP hydrolysis both for unwinding guide RNAs and for translocation along the
substrate.

Efficient induction of RNA interference in C. elegans and in Drosophila has
several requirements. For example, the initiating RNA must be double-stranded, and it
must be several hundred nucleotides in length. To determine whether these requirements
are dictated by Dicer, we characterized the ability of extracts and of immunoprecipitated
enzyme to digest various RNA substrates. Dicer was inactive against single-stranded
RNAs regardless of length (see Supplement 4). The enzyme could digest both 200 and
500 nucleotide dsRNAs but was significantly less active with shorter substrates (see
Supplement 4). Double-stranded RNAs as short as 35 nucleotides could be cut by the
enzyme, albeit very inefficiently (data not shown). In contrast, E. coli RNAse III could
digest to completion dsRNAs of 35 or 22 nucleotides (not shown). This suggests that the
substrate preferences of the Dicer enzyme may contribute to but not wholly determine the
size dependence of RNAi.

To determine whether the Dicer enzyme indeed played a role in RNAi in vivo, we
sought to deplete Dicer activity from S2 cells and test the effect on dsRNA-induced gene
silencing. Transfection of S2 cells with a mixture of dsRNAs homologous to the two
Drosophila Dicer genes (CG4792 and CG6493) resulted in an ~6-7 fold reduction of Dicer
activity either in whole cell lysates or in Dicer-1 immunoprecipitates (Fig. 7A,B).
Transfection with a control dsRNA (murine caspase 9) had no effect. Qualitatively similar
results were seen if Dicer was examined by Northern blotting (not shown). Depletion of
Dicer in this manner substantially compromised the ability of cells subsequently to silence
an exogenous GFP transgene by RNAi (Fig. 7C). These results indicate that Dicer is
involved in RNAi in vivo. The lack of complete inhibition of silencing could result from
an incomplete suppression of Dicer (which is itself required for RNAi) or could indicate
that in vivo, guide RNAs can be produced by more than one mechanism (e.g. through the
action of RNA-dependent RNA polymerases).

Our results indicate that the process of RNA interference can be divided into at
least two distinct steps. According to this model, initiation of PTGS would occur upon
processing of a double-stranded RNA by Dicer into ~22 nucleotide guide sequences,
although we cannot formally exclude the possibility that another, Dicer-associated
nuclease may participate in this process. These guide RNAs would be incorporated into a
distinct nuclease complex (RISC) that targets single-stranded mRNAs for degradation. An
implication of this model is that guide sequences are themselves derived directly from the
dsRNA that triggers the response. In accord with this model, we have demonstrated that
32P-labeled, exogenous dsRNAs that have been introduced into S2 cells by transfection are
incorporated into the RISC enzyme as 22mers (Fig. 7E). However, we cannot exclude the
possibility that RNA-dependent RNA polymerases might amplify 22mers once they have
been generated or provide an alternative method for producing guide RNAs.
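
The two-step model described above can be illustrated with a short sketch. The Python
below is purely illustrative and is not part of the disclosed methods: the function names
`dice` and `risc_targets`, the fixed 22-nucleotide register, the requirement for perfect
complementarity, and the toy sequences are all assumptions made for the example. Real
guide RNAs need not be produced in a fixed register.

```python
# Illustrative sketch of the two-step model (assumed names, toy sequences).

GUIDE_LENGTH = 22  # approximate guide RNA size reported in the text

_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (A, C, G, U only)."""
    return rna.translate(_COMPLEMENT)[::-1]


def dice(sense_strand: str, length: int = GUIDE_LENGTH) -> list[str]:
    """Step 1 (initiation): cut a dsRNA, given by its sense strand, into
    consecutive fragments and return the antisense strand of each one.
    A trailing piece shorter than `length` is discarded."""
    guides = []
    for start in range(0, len(sense_strand) - length + 1, length):
        fragment = sense_strand[start:start + length]
        guides.append(reverse_complement(fragment))
    return guides


def risc_targets(guides: list[str], mrna: str) -> list[int]:
    """Step 2 (effector): return the sorted mRNA positions that are
    perfectly complementary to any guide sequence."""
    sites = set()
    for guide in guides:
        site = reverse_complement(guide)  # sequence the guide pairs with
        pos = mrna.find(site)
        while pos != -1:
            sites.add(pos)
            pos = mrna.find(site, pos + 1)
    return sorted(sites)
```

In this sketch the guides are derived directly from the trigger sequence, which mirrors
the implication of the model that guide sequences come from the dsRNA itself; any mRNA
that contains the trigger region is therefore reported as a target.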

The structure of the Dicer enzyme provokes speculation on the mechanism by
which the enzyme might produce discretely sized fragments irrespective of the sequence
of the dsRNA (see Supplement 1, Fig. 8a). It has been established that bacterial RNAse
III acts on its substrate as a dimer. Similarly, a dimer of Dicer enzymes may be
required for cleavage of dsRNAs into ~22 nt pieces. According to one model, the
cleavage interval would be determined by the physical arrangement of the two RNAse III
domains within Dicer enzyme (Fig. 8a). A plausible alternative model would dictate that
cleavage was directed at a single position by the two RIII domains in a single Dicer
protein. The 22 nucleotide interval could be dictated by interaction of neighboring Dicer
enzymes or by translocation along the mRNA substrate. The presence of an integral
helicase domain suggests that the products of Dicer cleavage might be single-stranded
22mers that are incorporated into the RISC enzyme as such.

A notable feature of the Dicer family is its evolutionary conservation. Homologs
are found in C. elegans (K12H4.8), Arabidopsis (e.g., CARPEL FACTORY, T25K16.4,
AC012328_1), mammals (Helicase-MOI) and S. pombe (YC9A_SCHPO) (Fig 8b, see
Supplements 6,7 for sequence comparisons). In fact, the human Dicer family member is
capable of generating ~22 nt RNAs from dsRNA substrates (Supplement 5), suggesting
that these structurally similar proteins may all share similar biochemical functions. It has
been demonstrated that exogenous dsRNAs can affect gene function in early mouse
embryos, and our results suggest that this regulation may be accomplished by an
evolutionarily conserved RNAi machinery.

In addition to RNAse III and helicase motifs, searches of the PFAM database
indicate that each Dicer family member also contains a ZAP domain (Fig 8c). This
sequence was defined based solely upon its conservation in the
Zwille/ARGONAUTE/Piwi family that has been implicated in RNAi by mutations in C.
elegans (Rde-1) and Neurospora (Qde-2). Although the function of this domain is
unknown, it is intriguing that this region of homology is restricted to two gene families
that participate in dsRNA-dependent silencing. Both the ARGONAUTE and Dicer
families have also been implicated in common biological processes, namely the
determination of stem-cell fates. A hypomorphic allele of carpel factory, a member of the
Dicer family in Arabidopsis, is characterized by increased proliferation in floral
meristems. This phenotype and a number of other characteristic features are also shared
by Arabidopsis ARGONAUTE (ago1-1) mutants (C. Kidner and R. Martienssen, pers.
comm.). These genetic analyses begin to provide evidence that RNAi may be more than a
defensive response to unusual RNAs but may also play important roles in the regulation of
endogenous genes.

With the identification of Dicer as a catalyst of the initiation step of RNAi, we
have begun to unravel the biochemical basis of this unusual mechanism of gene
regulation. It will be of critical importance to determine whether the conserved family
members from other organisms, particularly mammals, also play a role in dsRNA-mediated
gene regulation.

Methods 

Plasmid constructs. A full-length cDNA encoding Drosha was obtained by PCR
from an EST sequenced by the Berkeley Drosophila genome project. The Homeless clone
was a gift from Gillespie and Berg (Univ. Washington). The T7 epitope-tag was added to
the amino terminus of each by PCR, and the tagged cDNAs were cloned into pRIP, a
retroviral vector designed specifically for expression in insect cells (E. Bernstein,
unpublished). In this vector, expression is driven by the Orgyia pseudotsugata IE2
promoter (Invitrogen). Since no cDNA was available for CG4792/Dicer, a genomic clone
was amplified from a bacmid (BACR23F10; obtained from the BACPAC Resource Center
in the Dept. of Human Genetics at the Roswell Park Cancer Institute). Again, during
amplification, a T7 epitope tag was added at the amino terminus of the coding sequence.
The human Dicer gene was isolated from a cDNA library prepared from HaCaT cells
(GJH, unpublished). A T7-tagged version of the complete coding sequence was cloned
into pCDNA3 (Invitrogen) for expression in human cells (LinX-A).

Cell culture and extract preparation. S2 and embryo culture. S2 cells were
cultured at 27°C in 5% CO2 in Schneider's insect media supplemented with 10% heat
inactivated fetal bovine serum (Gemini) and 1% antibiotic-antimycotic solution (Gibco
BRL). Cells were harvested for extract preparation at 10x10^ cells/ml. The cells were
washed 1X in PBS and were resuspended in a hypotonic buffer (10 mM Hepes pH 7.0,
2 mM MgCl2, 6 mM βME) and dounced. Cell lysates were spun at 20,000xg for 20 minutes.
Extracts were stored at -80°C. Drosophila embryos were reared in fly cages by standard
methodologies and were collected every 12 hours. The embryos were dechorionated in
50% chlorox bleach and washed thoroughly with distilled water. Lysis buffer (10 mM
Hepes, 10 mM KCl, 1.5 mM MgCl2, 0.5 mM EGTA, 10 mM β-glycerophosphate, 1 mM
DTT, 0.2 mM PMSF) was added to the embryos, and extracts were prepared by
homogenization in a tissue grinder. Lysates were spun for two hours at 200,000xg and
were frozen at -80°C. LinX-A cells, a highly-transfectable derivative of human 293 cells
(Lin Xie and GJH, unpublished), were maintained in DMEM/10% FCS.

Transfections and immunoprecipitations. S2 cells were transfected using a calcium
phosphate procedure essentially as previously described. Transfection rates were ~90%
as monitored in controls using an in situ β-galactosidase assay. LinX-A cells were also
transfected by calcium phosphate co-precipitation. For immunoprecipitations, cells
(~5x10^ per IP) were transfected with various clones and lysed three days later in IP buffer
(125 mM KOAc, 1 mM MgOAc, 1 mM CaCl2, 5 mM EGTA, 20 mM Hepes pH 7.0, 1 mM
DTT, 1% NP-40 plus Complete protease inhibitors (Roche)). Lysates were spun for 10
minutes at 14,000xg and supernatants were added to T7 antibody-agarose beads
(Novagen). Antibody binding proceeded for 4 hours at 4°C. Beads were centrifuged and
washed in lysis buffer three times, and once in reaction buffer. The Dicer antiserum was
raised in rabbits using a KLH-conjugated peptide corresponding to the C-terminal 8 amino
acids of Drosophila Dicer-1 (CG4792).

Cleavage reactions. RNA preparation. Templates to be transcribed into dsRNA
were generated by PCR with forward and reverse primers, each containing a T7 promoter
sequence. RNAs were produced using Riboprobe (Promega) kits and were uniformly
labeled during the transcription reaction with 32P-UTP. Single-stranded RNAs were
purified from 1% agarose gels. dsRNA cleavage. Five microliters of embryo or S2
extracts were incubated for one hour at 30°C with dsRNA in a reaction containing 20 mM
Hepes pH 7.0, 2 mM MgOAc, 2 mM DTT, 1 mM ATP and 5% Superasin (Ambion).
Immunoprecipitates were treated similarly except that a minimal volume of reaction buffer
(including ATP and Superasin) and dsRNA were added to beads that had been washed in
reaction buffer (see above). For ATP depletion, Drosophila embryo extracts were
incubated for 20 minutes at 30°C with 2 mM glucose and 0.375 U of hexokinase (Roche)
prior to the addition of dsRNA.
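
As a worked illustration of assembling the cleavage reaction described above, the
sketch below computes the volume of each stock solution needed to reach the stated final
concentrations, using C1 x V1 = C2 x V2. The final concentrations are those given in the
text; the stock concentrations, the 20 microliter reaction volume, and the function name
`stock_volumes` are assumptions chosen for the example and are not taken from the
disclosure. Superasin, specified as a percentage, is left out of the calculation.

```python
# Final concentrations (mM) from the cleavage reaction described in the text.
FINAL_MM = {"Hepes pH 7.0": 20.0, "MgOAc": 2.0, "DTT": 2.0, "ATP": 1.0}

# Hypothetical stock concentrations (mM); not specified in the source.
STOCK_MM = {"Hepes pH 7.0": 1000.0, "MgOAc": 100.0, "DTT": 100.0, "ATP": 100.0}


def stock_volumes(final_mm: dict, stock_mm: dict, reaction_ul: float) -> dict:
    """Return the volume (in microliters) of each stock needed so that the
    component reaches its final concentration in `reaction_ul` microliters.
    Uses C1 * V1 = C2 * V2, i.e. V1 = C2 * V2 / C1."""
    volumes = {}
    for name, final in final_mm.items():
        stock = stock_mm[name]
        if stock <= 0 or final > stock:
            raise ValueError(f"stock for {name} cannot reach {final} mM")
        volumes[name] = final * reaction_ul / stock
    return volumes
```

For an assumed 20 microliter reaction, `stock_volumes(FINAL_MM, STOCK_MM, 20.0)` gives
the stock volumes to pipette; the remainder of the volume would be made up by extract,
dsRNA, Superasin and water.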

Northern and Western analysis. Total RNA was prepared from Drosophila
embryos (0-12 hour), from adult flies, and from S2 cells using Trizol (Lifetech).
Messenger RNA was isolated by affinity selection using magnetic oligo-dT beads (Dynal).
RNAs were electrophoresed on denaturing formaldehyde/agarose gels, blotted and probed
with randomly primed DNAs corresponding to Dicer. For Western analysis, T7-tagged
proteins were immunoprecipitated from whole cell lysates in IP buffer using
anti-T7-antibody-agarose conjugates. Proteins were released from the beads by boiling in
Laemmli buffer and were separated by electrophoresis on 8% SDS PAGE. Following
transfer to nitrocellulose, proteins were visualized using an HRP-conjugated anti-T7
antibody (Novagen) and chemiluminescent detection (Supersignal, Pierce).

RNAi of Dicer. Drosophila S2 cells were transfected either with a dsRNA
corresponding to mouse caspase 9 or with a mixture of two dsRNAs corresponding to
Drosophila Dicer-1 and Dicer-2 (CG4792 and CG6493). Two days after the initial
transfection, cells were again transfected with a mixture containing a GFP expression
plasmid and either luciferase dsRNA or GFP dsRNA as previously described. Cells were
assayed for Dicer activity or fluorescence three days after the second transfection.
Quantification of fluorescent cells was done on a Coulter EPICS cell sorter after fixation.
Control transfections indicated that Dicer activity was not affected by the introduction of
caspase 9 dsRNA.

Example 3: A simplified method for the creation of hairpin constructs for RNA
interference.

In numerous model organisms, double stranded RNAs have been shown to cause
effective and specific suppression of gene function (ref. 1). This response, termed RNA
interference or post-transcriptional gene silencing, has evolved into a highly effective
reverse genetic tool in C. elegans, Drosophila, plants and numerous other systems. In
these cases, double-stranded RNAs can be introduced by injection, transfection or feeding;
however, in all cases, the response is both transient and systemic. Recently, stable
interference with gene expression has been achieved by expression of RNAs that form
snap-back or hairpin structures (refs 2-7). This has the potential not only to allow stable
silencing of gene expression but also inducible silencing as has been observed in
trypanosomes and adult Drosophila (refs 2,4,5). The utility of this approach is somewhat
hampered by the difficulties that arise in the construction of bacterial plasmids containing
the long inverted repeats that are necessary to provoke silencing. In a recent report, it was
stated that more than 1,000 putative clones were screened to identify the desired construct
(ref 7).

The presence of hairpin structures often induces plasmid rearrangement, in part
due to the E. coli sbc proteins that recognize and cleave cruciform DNA structures (ref 8).
We have developed a method for the construction of hairpins that does not require cloning
of inverted repeats, per se. Instead, the fragment of the gene that is to be silenced is
cloned as a direct repeat, and the inversion is accomplished by treatment with a
site-specific recombinase, either in vitro (or potentially in vivo) (see Fig 29). Following
recombination, the inverted repeat structure is stable in a bacterial strain that lacks an
intact SBC system (DL759). We have successfully used this strategy to construct
numerous hairpin expression constructs that have been successfully used to provoke gene
silencing in Drosophila cells.
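
The cloning logic described above can be sketched in a few lines of Python. This is an
illustration only: the function names, the toy spacer, and the example fragment are
assumptions introduced here, and the recombinase step is modeled simply as replacing the
second copy of the fragment with its reverse complement. The actual recombination sites
and their placement are not modeled.

```python
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return dna.translate(_DNA_COMPLEMENT)[::-1]


def direct_repeat(fragment: str, spacer: str) -> str:
    """Clone the gene fragment twice in the same orientation."""
    return fragment + spacer + fragment


def invert_second_copy(construct: str, fragment_len: int) -> str:
    """Model the site-specific recombination step as inversion of the
    second copy of the fragment (assumption: only that copy is flipped)."""
    head = construct[:-fragment_len]
    tail = construct[-fragment_len:]
    return head + reverse_complement(tail)


def is_inverted_repeat(construct: str, fragment_len: int) -> bool:
    """True if the two arms can base-pair, i.e. a transcript of the
    construct could fold back into a hairpin."""
    left = construct[:fragment_len]
    right = construct[-fragment_len:]
    return left == reverse_complement(right)
```

The direct repeat fails the `is_inverted_repeat` check, which corresponds to the form
that can be cloned without cruciform-induced rearrangement; after the modeled inversion
the same construct passes the check and has the arrangement needed to express a hairpin.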

K Equivalents

Those skilled in the art will recognize, or be able to ascertain using no more than
routine experimentation, many equivalents to the specific embodiments of the invention
described herein. Such equivalents are intended to be encompassed by the following
claims.

All of the above-cited references and publications are hereby incorporated by
reference.

We Claim:

1. A method for attenuating expression of a target gene in a non-embryonic cell
suspended in culture, comprising introducing into the cell a double stranded RNA
(dsRNA) in an amount sufficient to attenuate expression of the target gene,
wherein the dsRNA comprises a nucleotide sequence that hybridizes under
stringent conditions to a nucleotide sequence of the target gene.

2. A method for attenuating expression of a target gene in a mammalian cell,
comprising
(i) activating one or both of a Dicer activity or an Argonaut activity in the cell,
and
(ii) introducing into the cell a double stranded RNA (dsRNA) in an amount
sufficient to attenuate expression of the target gene, wherein the dsRNA
comprises a nucleotide sequence that hybridizes under stringent conditions
to a nucleotide sequence of the target gene.

3. The method of claim 2, wherein the cell is suspended in culture. 

4. The method of claim 2, wherein the cell is in a whole animal, such as a non-human
mammal.



5. The method of claim 1 or 2, wherein the cell is engineered with (i) a recombinant gene
encoding a Dicer activity, (ii) a recombinant gene encoding an Argonaut activity,
or (iii) both.

6. The method of claim 5, wherein the recombinant gene encodes a protein which
includes an amino acid sequence at least 50 percent identical to SEQ ID No. 2 or 4
or the Argonaut sequence shown in Figure 24.

7. The method of claim 5, wherein the recombinant gene includes a coding sequence
that hybridizes under wash conditions of 2 x SSC at 22°C to SEQ ID No. 1 or 3.

8. The method of claim 1 or 2, wherein an endogenous Dicer gene or Argonaut gene
is activated.

9. The method of claim 1 or 2, wherein the target gene is an endogenous gene of the
cell.

10. The method of claim 1 or 2, wherein the target gene is a heterologous gene
relative to the genome of the cell, such as a pathogen gene.

11. The method of claim 1 or 2, wherein the cell is treated with an agent that inhibits
protein kinase RNA-activated (PKR) apoptosis, such as by treatment with agents
which inhibit expression of PKR, cause its destruction, and/or inhibit the kinase
activity of PKR.

12. The method of claim 1 or 2, wherein the cell is a primate cell, such as a human
cell.

13. The method of claim 1 or 2, wherein the dsRNA is at least 50 nucleotides in
length.

14. The method of claim 13, wherein the dsRNA is 400-800 nucleotides in length. 

15. The method of claim 13, wherein the dsRNA is 400-800 nucleotides in length. 

16. An assay for identifying nucleic acid sequences responsible for conferring a
particular phenotype in a cell, comprising
(i) constructing a variegated library of nucleic acid sequences from a cell in an
orientation relative to a promoter to produce double stranded DNA;
(ii) introducing the variegated dsRNA library into a culture of target cells,
which cells have an activated Dicer activity or Argonaut activity;
(iii) identifying members of the library which confer a particular phenotype on
the cell, and identifying the sequence from a cell which correspond, such as
being identical or homologous, to the library member.

17. A method of conducting a drug discovery business comprising:
(i) identifying, by the assay of claim 16, a target gene which provides a
phenotypically desirable response when inhibited by RNAi;
(ii) identifying agents by their ability to inhibit expression of the target gene or
the activity of an expression product of the target gene;
(iii) conducting therapeutic profiling of agents identified in step (ii), or further
analogs thereof, for efficacy and toxicity in animals; and
(iv) formulating a pharmaceutical preparation including one or more agents
identified in step (iii) as having an acceptable therapeutic profile.

18. The method of claim 17, including an additional step of establishing a distribution
system for distributing the pharmaceutical preparation for sale, and may optionally
include establishing a sales group for marketing the pharmaceutical preparation.



19. A method of conducting a target discovery business comprising:
(i) identifying, by the assay of claim 16, a target gene which provides a
phenotypically desirable response when inhibited by RNAi;
(ii) (optionally) conducting therapeutic profiling of the target gene for efficacy
and toxicity in animals; and
(iii) licensing, to a third party, the rights for further drug development of
inhibitors of the target gene.

SEQUENCE LISTING
<110> Genetica, Inc.

<120> Methods and Compositions for RNA Interference

<130> GCNA-pWO-007

<140> PCT/US01/—
<141> 2001-03-16

<150> US 60/189,739
<151> 2000-03-16

<150> US 60/243,097
<151> 2000-10-24

<160> 4

<170> PatentIn version 3.0

<210> 1 
<211> 5775 
<212> DNA 
<213> Homo sapiens 

<220> 
<221> CDS 
<222> (1)..(5775)

<400> 1 

atg aaa agc cct gct ttg caa ccc ctc agc atg gca ggc ctg cag ctc 48
Met Lys Ser Pro Ala Leu Gln Pro Leu Ser Met Ala Gly Leu Gln Leu
1 5 10 15

atg acc cct gct tcc tca cca atg ggt cct ttc ttt gga ctg cca tgg 96
Met Thr Pro Ala Ser Ser Pro Met Gly Pro Phe Phe Gly Leu Pro Trp
20 25 30

caa caa gaa gca att cat gat aac att tat acg cca aga aaa tat cag 144
Gln Gln Glu Ala Ile His Asp Asn Ile Tyr Thr Pro Arg Lys Tyr Gln
35 40 45

gtt gaa ctg ctt gaa gca gct ctg gat cat aat acc atc gtc tgt tta 192
Val Glu Leu Leu Glu Ala Ala Leu Asp His Asn Thr Ile Val Cys Leu
50 55 60

aac act ggc tca ggg aag aca ttt att gct agt act act cta cta aag 240
Asn Thr Gly Ser Gly Lys Thr Phe Ile Ala Ser Thr Thr Leu Leu Lys
65 70 75 80

agc tgt ctc tat cta gat cta ggg gag act tca gct aga aat gga aaa 288
Ser Cys Leu Tyr Leu Asp Leu Gly Glu Thr Ser Ala Arg Asn Gly Lys
85 90 95

agg acg gtg ttc ttg gtc aac tct gca aac cag gtt gct caa caa gtg 336
Arg Thr Val Phe Leu Val Asn Ser Ala Asn Gln Val Ala Gln Gln Val
100 105 110

tca gct gtc aga act cat tca gat ctc aag gtt ggg gaa tac tca aac 384
Ser Ala Val Arg Thr His Ser Asp Leu Lys Val Gly Glu Tyr Ser Asn
115 120 125

cte gaa gta aat gca tct tgg aca aaa gag aga tgg aac caa gag ttt 432 
Leu Glu Val Asn Ala Ser Trp Thr Lys Glu Arg Trp Asn Gin Glu Phe 
130 135 140 

act aag cac cag gtt etc att atg act tgc tat gtc gcc ttg aat gtt 480 
Thr Lys His Gin Val Leu lie Met Thr Cys Tyr Val Ala Leu Asn Val 
145 150 155 160 

ttg aaa aat ggt tac tta tea ctg tea gac att aac ctt ttg gtg ttt 528 
Leu Lys Asn Gly Tyr Leu Ser Leu Ser Asp lie Asn Leu Leu Val Phe 
165 170 175 
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gat gag tgt cat ctt gca ate eta gac cac ccc tat cga gaa ttt atg 576 
Asp Glu Cys His Leu Ala lie Leu Asp His Pro Tyr Arg Glu Phe Met 
180 185 190 

aag etc tgt gaa att tgt cca tea tgt ect cgc att ttg gga eta act 624 
Lys Leu Cys Glu lie Cys Pro Ser Cys Pro Arg lie Leu Gly Leu Thr 
195 200 205 

get tec att tta aat ggg aaa tgg gat cca gag gat ttg gaa gaa aag 672 
A!a Ser lie Leu Asn Gly Lys Trp Asp Pro Glu Asp Leu Glu Glu Lys 
210 215 220 

ttt cag aaa eta gag aaa att ctt aag agt aat get gaa act gca act 720 
Phe Gin Lys Leu G!u Lys lie Leu Lys Ser Asn Ala Glu Thr Ala Thr 
225 230 235 240 

gac ctg gtg gtc tta gac agg tat act tct cag cca tgt gag att gtg 768 
Asp Leu Val Val Leu Asp Arg Tyr Thr Ser Gin Pro Cys Glu lie Val 
245 250 255 

gtg gat tgt gga cca ttt act gac aga agt ggg ctt tat gaa aga ctg 816 
Val Asp Cys Gly Pro Phe Thr Asp Arg Ser Gly Leu Tyr Glu Arg Leu 
260 265 270 

ctg atg gaa tta gaa gaa gca ctt aat ttt ate aat gat tgt aat ata 864 
Leu Met Glu Leu Glu Glu Ala Leu Asn Phe lie Asn Asp Cys Asn He 
275 280 285 

tct gta cat tea aaa gaa aga gat tct act tta att teg aaa cag ata 912 
Ser Val His Ser Lys Glu Arg Asp Ser Thr Leu lie Ser Lys Gin lie 
290 295 300 

eta tea gac tgt cgt gee gta ttg gta gtt ctg gga ccc tgg tgt gca 960 
Leu Ser Asp Cys Arg Ala Val Leu Val Val Leu Gly Pro Trp Cys Ala 
305 310 315 320 
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gat aaa gta get gga atg atg gta aga gaa eta cag aaa tac ate aaa 1 008 
Asp Lys Val Ala Gly Met Met Val Arg G!u Leu Gin Lys Tyr lie Lys 
325 330 335 

cat gag caa gag gag ctg cac agg aaa ttt tta ttg ttt aca gac act 1056 
His Glu Gin Glu Glu Leu His Arg Lys Phe Leu Leu Phe Thr Asp Thr 
340 345 350 

ttc Ota agg aaa ata cat gca eta tgt gaa gag cac tie tea cct gee 1 104 
Phe Leu Arg Lys lie His Ala Leu Cys Glu Glu His Phe Ser Pro Ala 
355 360 365 

tea ctt gac ctg aaa ttt gta act cct aaa gta ate aaa ctg etc gaa 11 52 
Ser Leu Asp Leu Lys Phe Val Thr Pro Lys Val lie Lys Leu Leu Glu 
370 375 380 

ate tta cgc aaa tat aaa cca tat gag ega cac agt ttt gaa age gtt 1 200 
lie Leu Arg Lys Tyr Lys Pro Tyr Glu Arg His Ser Phe Glu Ser Val 
385 390 395 400 

gag tgg tat aat aat aga aat eag gat aat tat gtg tea tgg agt gat 1248 
Glu Trp Tyr Asn Apn Arg Asn Gin Asp Asn Tyr Val Ser Trp Ser Asp 
405 410 415 

tct gag gat gat gat gag gat gaa gaa att gaa gaa aaa gag aag cea 1296 
Ser Glu Asp Asp Asp Glu Asp Glu Glu lie Glu Glu Lys Glu Lys Pro 
420 425 430 

gag aca aat ttt cct tct cct ttt ace aac att ttg tge gga att att 1 344 
Glu Thr Asn Phe Pro Ser Pro Phe Thr Asn lie Leu Cys Gly lie lie 
435 440 445 

ttt gtg gaa aga aga tac aca gca gtt gtc tta aac aga ttg ata aag 1392 
Phe Val Glu Arg Arg Tyr Thr Ala Val Val Leu Asn Arg Leu lie Lys 
450 455 460 

gaa get ggc aaa caa gat cca gag ctg get tat ate agt age aat ttc 1440 
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Glu Ala Gly Lys Gin Asp Pro Glu Leu Ala Tyr lie Ser Ser Asn Phe 
465 470 475 480 

ata act gga cat ggc att ggg aag aat cag cct cgc aac aac acg atg 1488 
lie Thr Gly His Gly lie Gly Lys Asn Gin Pro Arg Asn Asn Thr Met 
485 490 495 

gaa gca gaa ttc aga aaa cag gaa gag gta ctt agg aaa ttt cga gca 1536 
Glu Ala Glu Phe Arg Lys Gin Glu Glu Val Leu Arg Lys Phe Arg Ala 
500 505 510 

cat gag acc aac ctg ctt att gca aca agt att gta gaa gag ggt gtt 1 584 
His Glu Thr Asn Leu Leu lie Ala Thr Ser lie Val Glu Glu Gly Val 
515 620 525 

gat ata cca aaa tgc aac ttg gtg gtt cgt ttt gat ttg ccc aca gaa 1 632 
Asp He Pro Lys Cys Asn Leu Val Val Arg Phe Asp Leu Pro Thr Glu 
530 535 540 

tat cga toe tat gtt caa tct aaa gga aga gca agg gca ccc ate tct 1680 
Tyr Arg Ser Tyr Val Gin Ser Lys Gly Arg Ala Arg Ala Pro lie Ser 
545 650 555 560 

aat tat ata atg tta gcg gat aca gao aaa ata aaa agt ttt gaa gaa 1728 
Asn Tyr lie Met Leu Ala Asp Thr Asp Lys He Lys Ser Phe Glu Glu 
565 670 575 

gac ctt aaa acc tac aaa get att gaa aag ate ttg aga aac aag tgt 1776 
Asp Leu Lys Thr Tyr Lys Ala lie Glu Lys lie Leu Arg Asn Lys Cys 
680 685 690 

tee aag teg gtt gat act ggt gag act gae att gat cct gte atg gat 1824 
Ser Lys Ser Val Asp Thr Gly Glu Thr Asp lie Asp Pro Val Met Asp 
595 600 605 

gat gat cac gtt ttc cca cea tat gtg ttg agg cct gac gat ggt ggt 1872 
Asp Asp His Val Phe Pro Pro Tyr Val Leu Arg Pro Asp Asp Giy Gty 
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610 615 620 

cca cga gtc aca ate aac acg gcc att gga cac ate aat aga tae tgt 1920 
Pro Arg Val Thr He Asn Thr Ala lie Gly His ile Asn Arg Tyr Cys 
625 630 635 640 

get aga tta cca agt gat ccg ttt act cat eta get cct aaa tgc aga 1 968 
Ala Arg Leu Pro Ser Asp Pro Phe Thr His Leu Ala Pro Lys Cys Arg 
646 650 655 

acc cga gag ttg cct gat ggt aca ttt tat tea act ctt tat ctg cca 201 6 
Thr Arg Glu Leu Pro Asp Gly Thr Phe Tyr Ser Thr Leu Tyr Leu Pro 
660 665 670 

att aac tea cct ctt cga gcc tec att gtt ggt cca cca atg age tgt 2064 
lie Asn Ser Pro Leu Arg Ala Ser lie Val Gly Pro Pro Met Ser Cys 
675 680 685 

gta cga ttg get gaa aga gtt gtc get etc att tgc tgt gag aaa ctg 2112 
Val Arg Leu Ala Glu Arg Val Val Ala Leu Ile Cys Cys Glu Lys Leu 
B90 695 700 

cac aaa att ggc gaa ctg gat gac cat ttg atg cca gtt ggg aaa gag 21 60 
His Lys ile Gly Glu Leu Asp Asp His Leu Met Pro Val Gly Lys Glu 
705 710 715 720 

act gtt aaa tat gaa gag gag ctt gat ttg cat gat gaa gaa gag acc 2208 
Thr Val Lys Tyr Glu Glu Glu Leu Asp Leu His Asp Glu Glu Glu Thr 
725 730 735 

agt gtt cca gga aga cca ggt tec acg aaa cga agg cag tgc tae cca 2256 
Ser Val Pro Gly Arg Pro Gly Ser Thr Lys Arg Arg Gin Cys Tyr Pro 
740 745 750 

aaa gca att cca gag tgt ttg agg gat agt tat cce aga cct gat cag 2304 
Lys Ala lie Pro Glu Cys Leu Arg Asp Ser Tyr Pro Aug Pro Asp Gin 
755 760 765 
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ccc tgt tac ctg tat gtg ata gga atg gtt tta act aca cct tta cct 2352 
Pro Cys Tyr Leu Tyr Val lie Gly Met Val Leu Thr Thr Pro Leu Pro 
770 775 780 

gat gaa etc aac ttt aga agg egg aag etc tat cct cct gaa gat ace 2400 
Asp Glu Leu Asn Phe Arg Arg Arg Lys Leu Tyr Pro Pro Giu Asp Thr 
785 790 795 800 

aca aga tgc ttt gga ata ctg acg gcc aaa ccc ata cct cag att cca 2448 
Thr Arg Cys Phe Gly lie Leu Thr Ala Lys Pro He Pro Gin He Pro 
805 810 815 

cac ttt cct gtg tac aca cgc tct gga gag gtt acc ata tec att gag 2496 
His Phe Pro Val Tyr Thr Arg Ser Gly Glu Val Thr He Ser He Glu 
820 825 830 

ttg aag aag tct ggt ttc atg ttg tct eta caa atg ctt gag ttg att 2544 
Leu Lys Lys Ser Gly Phe Met Leu Ser Leu Gin Met Leu Glu Leu lie 
835 840 845 

aca aga ctt cac cag tat ata ttc tea cat att ctt egg ctt gaa aaa 2592 
Thr Arg Leu His Gin Tyr He Phe Ser His He Leu Arg Leu Glu Lys 
850 855 860 

cct gea eta gaa ttt aaa cct aca gac get gat tea gca tac tgt gtt 2640 
Pro Ala Leu Glu Phe Lys Pro Thr Asp Ala Asp Ser Ala Tyr Cys Val 
865 870 876 880 

eta cct ctt aat gtt gtt aat gac tec age act ttg gat att gac ttt 2688 
Leu Pro Leu Asn Val Val Asn Asp Ser Ser Thr Leu Asp He Asp Phe 
885 890 895 

aaa ttc atg gaa gat att gag aag tct gaa get cgc ata ggc att ccc 2736 
Lys Phe Met Glu Asp He Glu Lys Ser Glu Ala Arg He Gly He Pro 
900 905 910 
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agt aca aag tat aca aaa gaa aca ccc ttt gtt ttt aaa tta gaa gat 2784 
Ser Thr Lys Tyr Thr Lys Glu Thr Pro Phe Val Phe Lys Leu Glu Asp 
915 920 925 

tac caa gat gcc gtt ate att cca aga tat cgc aat ttt gat cag cct 2832 
Tyr Gin Asp Ala Val lie lie Pro Arg Tyr Arg Asn Phe Asp Gin Pro 
930 935 940 

cat cga ttt tat gta get gat gtg tac act gat ctt ace cca etc agt 2880 
His Arg Phe Tyr Val Ala Asp Val Tyr Thr Asp Leu Thr Pro Leu Ser 
945 950 955 960 

aaa ttt cct tec cct gag tat gaa act ttt gca gaa tat tat aaa aca 2928 
Lys Phe Pro Ser Pro Glu Tyr Glu Thr Phe Ala Glu Tyr Tyr Lys Thr 
965 970 975 

aag tac aac ctt gac eta ace aat etc aac cag cca ctg ctg gat gtg 2976 
Lys Tyr Asn Leu Asp Leu Thr Asn Leu Asn Gin Pro Leu Leu Asp Val 
980 985 990 

gac cac aca tct tea aga ctt aat ctt ttg aca cct cga cat ttg aat 3024 
Asp His Thr Ser Ser Arg Leu Asn Leu Leu Thr Pro Arg Hfs Leu Asn 
995 1000 1005 

cag aag ggg aaa gcg ctt cct tta age agt get gag aag agg aaa 3069 
G!n Lys Gly Lys Ala Leu Pro Leu Ser Ser Ala Glu Lys Arg Lys 
1010 1015 1020 

gcc aaa tgg gaa agt ctg cag aat aaa cag ata ctg gtt cca gaa 31 14 
Ala Lys Trp Glu Ser Leu Gin Asn Lys Gin lie Leu Val Pro Glu 
1025 1030 1035 

etc tgt get ata cat cca att cca gca tea ctg tgg aga aaa get 31 59 
Leu Cys Ala He His Pro He Pro Ala Ser Leu Trp Arg Lys Ala 
1040 1045 1050 

gtt tgt etc ccc age ata ctt tat cgc ctt cac tgc ctt ttg act 3204 
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Val Cys Leu Pro Ser lie Leu Tyr Arg Leu His Cys Leu Leu Thr 
1055 1060 1065 

gca gag gag eta aga gcc cag act gcc age gat get ggc gtg gga 3249 
Ala Glu Glu Leu Arg Ala Gin Thr Ala Ser Asp Ala Gly Val Gly 
1070 1075 1080 

gtc aga tea ctt cct gcg gat ttt aga tac cct aac tta gae ttc 3294 
Val Arg Ser Leu Pro Ala Asp Phe Arg Tyr Pro Asn Leu Asp Phe 
1085 1090 1095 

ggg tgg aaa aaa tet att gae age aaa tct ttc ate tea att tet 3339 
GlyTrp Lys Lys Ser lie Asp Ser Lys Ser Phe lie Ser lie Ser 
1100 1105 1110 

aac tec tct tea get gaa aat gat aat tac tgt aag cac age aca 3384 
Asn Ser Ser Ser Ala Glu Asn Asp Asn Tyr Cys Lys His Ser Thr 
1115 1120 1125 

att gtc cct gaa aat get gca cat caa ggt get aat aga ace tec 3429 
lie Val Pro Glu Asn Ala Ala His Gin Gly Ala Asn Arg Thr Ser 
1130 1136 1140 

tct eta gaa aat cat gae caa atg tct gtg aac tge agaaegttg 3474 
Ser Leu Glu Asn His Asp Gin Met Ser Val Asn Cys Arg Thr Leu 
1145 1150 1155 

etc age gag tec cct ggt aag etc cac gtt gaa gtt tea gca gat 3519 
Leu Ser Glu Ser Pro Gly Lys Leu His Val Glu Val Ser Ala Asp 
1160 1165 1170 

ctt aca gca att aat ggt ctt tct tac aat caa aat etc gcc aat 3564 
Leu Thr Ala He Asn Gly Leu Ser Tyr Asn Gin Asn Leu Ala Asn 
1175 1180 1185 

ggc agt tat gat tta get aac aga gae ttt tgc caa gga aat cag 3609 
Gly Ser Tyr Asp Leu Ala Asn Arg Asp Phe Cys Gin Gly Asn Gin 
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1190 1195 1200 

eta aat tac tac aag cag gaa ata ccc gtg caa cca act acc tea 3654 
Leu Asn Tyr Tyr Lys Gin Glu lie Pro Val Gin Pro Thr Thr Ser 
1205 1210 1215 

tat tec att cag aat tta tac agt tac gag aac cag ccc cag ccc 3699 
Tyr Ser He Gin Asn Leu Tyr Ser Tyr Glu Asn Gin Pro Gin Pro 
1220 1225 1230 

age gat gaa tgt act etc ctg agt aat aaa tac ctt gat gga aat 3744 
Ser Asp Glu Cys Thr Leu Leu Ser Asn Lys Tyr Leu AspGlyAsn 
1235 1240 1245 

get aac aaa tct aec tea gat gga agt ect gtg atg gee gta atg 3789 
Ala Asn Lys Ser Thr Ser Asp Gly Ser Pro Val Met Ala Va! Met 
1250 1255 1260 

cct ggt acg aca gac act att caa gtg etc aag ggc agg atg gat 3834 
Pro Gly Thr Thr Asp Thr lie Gin Val Leu Lys Gly Arg Met Asp 
1265 1270 1275 

tct gag cag age ect tct att ggg tac tec tea agg act ctt ggc 3879 
Ser Glu Gin Ser Pro Ser lie Gly Tyr Ser Ser Arg Thr Leu Gly 
1280 1285 1290 

ccc aat cct gga ctt att ctt cag get ttg act ctg tea aac get 3924 
Pro Asn Pro Gly Leu He Leu Gin Ala Leu Thr Leu Ser Asn Ala 
1295 1300 1305 

agt gat gga ttt aac ctg gag egg ctt gaa atg ctt ggc gac tec 3969 
Ser Asp Gly Phe Asn Leu Glu Arg Leu Glu Met Leu Gly Asp Ser 
1310 1315 1320 

ttt tta aag eat gee ate acc aca tat eta ttt tgc act tac cct 4014 
Phe Leu Lys His Ala lie Thr Thr Tyr Leu Phe Cys Thr Tyr Pro 
1325 1330 1335

gat gcg cat gag ggc cgc ctt tca tat atg aga agc aaa aag gtc 4059
Asp Ala His Glu Gly Arg Leu Ser Tyr Met Arg Ser Lys Lys Val
1340 1345 1350

agc aac tgt aat ctg tat cgc ctt gga aaa aag aag gga cta ccc 4104
Ser Asn Cys Asn Leu Tyr Arg Leu Gly Lys Lys Lys Gly Leu Pro
1355 1360 1365

agc cgc atg gtg gtg tca ata ttt gat ccc cct gtg aat tgg ctt 4149
Ser Arg Met Val Val Ser Ile Phe Asp Pro Pro Val Asn Trp Leu
1370 1375 1380

cct cct ggt tat gta gta aat caa gac aaa agc aac aca gat aaa 4194
Pro Pro Gly Tyr Val Val Asn Gln Asp Lys Ser Asn Thr Asp Lys
1385 1390 1395

tgg gaa aaa gat gaa atg aca aaa gac tgc atg ctg gcg aat ggc 4239
Trp Glu Lys Asp Glu Met Thr Lys Asp Cys Met Leu Ala Asn Gly
1400 1405 1410

aaa ctg gat gag gat tac gag gag gag gat gag gag gag gag agc 4284
Lys Leu Asp Glu Asp Tyr Glu Glu Glu Asp Glu Glu Glu Glu Ser
1415 1420 1425

ctg atg tgg agg gct ccg aag gaa gag gct gac tat gaa gat gat 4329
Leu Met Trp Arg Ala Pro Lys Glu Glu Ala Asp Tyr Glu Asp Asp
1430 1435 1440

ttc ctg gag tat gat cag gaa cat atc aga ttt ata gat aat atg 4374
Phe Leu Glu Tyr Asp Gln Glu His Ile Arg Phe Ile Asp Asn Met
1445 1450 1455

tta atg ggg tca gga gct ttt gta aag aaa atc tct ctt tct cct 4419
Leu Met Gly Ser Gly Ala Phe Val Lys Lys Ile Ser Leu Ser Pro
1460 1465 1470

ttt tca acc act gat tct gca tat gaa tgg aaa atg ccc aaa aaa 4464
Phe Ser Thr Thr Asp Ser Ala Tyr Glu Trp Lys Met Pro Lys Lys
1475 1480 1485

tcc tcc tta ggt agt atg cca ttt tca tca gat ttt gag gat ttt 4509
Ser Ser Leu Gly Ser Met Pro Phe Ser Ser Asp Phe Glu Asp Phe
1490 1495 1500

gac tac agc tct tgg gat gca atg tgc tat ctg gat cct agc aaa 4554
Asp Tyr Ser Ser Trp Asp Ala Met Cys Tyr Leu Asp Pro Ser Lys
1505 1510 1515

gct gtt gaa gaa gat gac ttt gtg gtg ggg ttc tgg aat cca tca 4599
Ala Val Glu Glu Asp Asp Phe Val Val Gly Phe Trp Asn Pro Ser
1520 1525 1530

gaa gaa aac tgt ggt gtt gac acg gga aag cag tcc att tct tac 4644
Glu Glu Asn Cys Gly Val Asp Thr Gly Lys Gln Ser Ile Ser Tyr
1535 1540 1545

gac ttg cac act gag cag tgt att gct gac aaa agc ata gcg gac 4689
Asp Leu His Thr Glu Gln Cys Ile Ala Asp Lys Ser Ile Ala Asp
1550 1555 1560

tgt gtg gaa gcc ctg ctg ggc tgc tat tta acc agc tgt ggg gag 4734
Cys Val Glu Ala Leu Leu Gly Cys Tyr Leu Thr Ser Cys Gly Glu
1565 1570 1575

399 9ct gct cag ctt ttc ctc tgt tca ctg ggg ctg aag gtg ctc 4779
Arg Ala Ala Gln Leu Phe Leu Cys Ser Leu Gly Leu Lys Val Leu
1580 1585 1590

ccg gta att aaa agg act gat cgg gaa aag gcc ctg tgc cct act 4824
Pro Val Ile Lys Arg Thr Asp Arg Glu Lys Ala Leu Cys Pro Thr
1595 1600 1605

cgg gag aat ttc aac agc caa caa aag aac ctt tca gtg agc tgt 4869
Arg Glu Asn Phe Asn Ser Gln Gln Lys Asn Leu Ser Val Ser Cys
1610 1615 1620

gct gct gct tct gtg gcc agt tca cgc tct tct gta ttg aaa gac 4914
Ala Ala Ala Ser Val Ala Ser Ser Arg Ser Ser Val Leu Lys Asp
1625 1630 1635

tcg gaa tat ggt tgt ttg aag att cca cca aga tgt atg ttt gat 4959
Ser Glu Tyr Gly Cys Leu Lys Ile Pro Pro Arg Cys Met Phe Asp
1640 1645 1650

cat cca gat gca gat aaa aca ctg aat cac ctt ata tcg ggg ttt 5004
His Pro Asp Ala Asp Lys Thr Leu Asn His Leu Ile Ser Gly Phe
1655 1660 1665

gaa aat ttt gaa aag aaa atc aac tac aga ttc aag aat aag gct 5049
Glu Asn Phe Glu Lys Lys Ile Asn Tyr Arg Phe Lys Asn Lys Ala
1670 1675 1680

tac ctt ctc cag gct ttt aca cat gcc tcc tac cac tac aat act 5094
Tyr Leu Leu Gln Ala Phe Thr His Ala Ser Tyr His Tyr Asn Thr
1685 1690 1695

atc act gat tgt tac cag cgc tta gaa ttc ctg gga gat gcg att 5139
Ile Thr Asp Cys Tyr Gln Arg Leu Glu Phe Leu Gly Asp Ala Ile
1700 1705 1710

ttg gac tac ctc ata acc aag cac ctt tat gaa gac ccg cgg cag 5184
Leu Asp Tyr Leu Ile Thr Lys His Leu Tyr Glu Asp Pro Arg Gln
1715 1720 1725

cac tcc ccg ggg gtc ctg aca gac ctg cgg tct gcc ctg gtc aac 5229
His Ser Pro Gly Val Leu Thr Asp Leu Arg Ser Ala Leu Val Asn
1730 1735 1740

aac acc atc ttt gca tcg ctg gct gta aag tac gac tac cac aag 5274
Asn Thr Ile Phe Ala Ser Leu Ala Val Lys Tyr Asp Tyr His Lys
1745 1750 1755

tac ttc aaa gct gtc tct cct gag ctc ttc cat gtc att gat gac 5319
Tyr Phe Lys Ala Val Ser Pro Glu Leu Phe His Val Ile Asp Asp
1760 1765 1770

ttt gtg cag ttt cag ctt gag aag aat gaa atg caa gga atg gat 5364
Phe Val Gln Phe Gln Leu Glu Lys Asn Glu Met Gln Gly Met Asp
1775 1780 1785

tct gag ctt agg aga tct gag gag gat gaa gag aaa gaa gag gat 5409
Ser Glu Leu Arg Arg Ser Glu Glu Asp Glu Glu Lys Glu Glu Asp
1790 1795 1800

att gaa gtt cca aag gcc atg ggg gat att ttt gag tcg ctt gct 5454
Ile Glu Val Pro Lys Ala Met Gly Asp Ile Phe Glu Ser Leu Ala
1805 1810 1815

ggt gcc att tac atg gat agt ggg atg tca ctg gag aca gtc tgg 5499
Gly Ala Ile Tyr Met Asp Ser Gly Met Ser Leu Glu Thr Val Trp
1820 1825 1830

cag gtg tac tat ccc atg atg cgg cca cta ata gaa aag ttt tct 5544
Gln Val Tyr Tyr Pro Met Met Arg Pro Leu Ile Glu Lys Phe Ser
1835 1840 1845

gca aat gta ccc cgt tcc cct gtg cga gaa ttg ctt gaa atg gaa 5589
Ala Asn Val Pro Arg Ser Pro Val Arg Glu Leu Leu Glu Met Glu
1850 1855 1860

cca gaa act gcc aaa ttt agc ccg gct gag aga act tac gac ggg 5634
Pro Glu Thr Ala Lys Phe Ser Pro Ala Glu Arg Thr Tyr Asp Gly
1865 1870 1875

aag gtc aga gtc act gtg gaa gta gta gga aag ggg aaa ttt aaa 5679
Lys Val Arg Val Thr Val Glu Val Val Gly Lys Gly Lys Phe Lys
1880 1885 1890

ggt gtt ggt cga agt tac agg att gcc aaa tct gca gca gca aga 5724
Gly Val Gly Arg Ser Tyr Arg Ile Ala Lys Ser Ala Ala Ala Arg
1895 1900 1905

aga gcc ctc cga agc ctc aaa gct aat caa cct cag gtt ccc aat 5769
Arg Ala Leu Arg Ser Leu Lys Ala Asn Gln Pro Gln Val Pro Asn
1910 1915 1920

agc tga 5775
Ser



<210> 2 
<211> 1924 
<212> PRT 
<213> Homo sapiens 

<400> 2 

Met Lys Ser Pro Ala Leu Gln Pro Leu Ser Met Ala Gly Leu Gln Leu
1 5 10 15

Met Thr Pro Ala Ser Ser Pro Met Gly Pro Phe Phe Gly Leu Pro Trp
20 25 30

Gln Gln Glu Ala Ile His Asp Asn Ile Tyr Thr Pro Arg Lys Tyr Gln
35 40 45

Val Glu Leu Leu Glu Ala Ala Leu Asp His Asn Thr Ile Val Cys Leu
50 55 60

Asn Thr Gly Ser Gly Lys Thr Phe Ile Ala Ser Thr Thr Leu Leu Lys
65 70 75 80

Ser Cys Leu Tyr Leu Asp Leu Gly Glu Thr Ser Ala Arg Asn Gly Lys
85 90 95

Arg Thr Val Phe Leu Val Asn Ser Ala Asn Gln Val Ala Gln Gln Val
100 105 110

Ser Ala Val Arg Thr His Ser Asp Leu Lys Val Gly Glu Tyr Ser Asn
115 120 125

Leu Glu Val Asn Ala Ser Trp Thr Lys Glu Arg Trp Asn Gln Glu Phe
130 135 140

Thr Lys His Gln Val Leu Ile Met Thr Cys Tyr Val Ala Leu Asn Val
145 150 155 160

Leu Lys Asn Gly Tyr Leu Ser Leu Ser Asp Ile Asn Leu Leu Val Phe
165 170 175

Asp Glu Cys His Leu Ala Ile Leu Asp His Pro Tyr Arg Glu Phe Met
180 185 190

Lys Leu Cys Glu Ile Cys Pro Ser Cys Pro Arg Ile Leu Gly Leu Thr
195 200 205

Ala Ser Ile Leu Asn Gly Lys Trp Asp Pro Glu Asp Leu Glu Glu Lys
210 215 220



Phe Gln Lys Leu Glu Lys Ile Leu Lys Ser Asn Ala Glu Thr Ala Thr
225 230 235 240

Asp Leu Val Val Leu Asp Arg Tyr Thr Ser Gln Pro Cys Glu Ile Val
245 250 255

Val Asp Cys Gly Pro Phe Thr Asp Arg Ser Gly Leu Tyr Glu Arg Leu
260 265 270

Leu Met Glu Leu Glu Glu Ala Leu Asn Phe Ile Asn Asp Cys Asn Ile
275 280 285

Ser Val His Ser Lys Glu Arg Asp Ser Thr Leu Ile Ser Lys Gln Ile
290 295 300

Leu Ser Asp Cys Arg Ala Val Leu Val Val Leu Gly Pro Trp Cys Ala
305 310 315 320

Asp Lys Val Ala Gly Met Met Val Arg Glu Leu Gln Lys Tyr Ile Lys
325 330 335

His Glu Gln Glu Glu Leu His Arg Lys Phe Leu Leu Phe Thr Asp Thr
340 345 350

Phe Leu Arg Lys Ile His Ala Leu Cys Glu Glu His Phe Ser Pro Ala
355 360 365

Ser Leu Asp Leu Lys Phe Val Thr Pro Lys Val Ile Lys Leu Leu Glu
370 375 380

Ile Leu Arg Lys Tyr Lys Pro Tyr Glu Arg His Ser Phe Glu Ser Val
385 390 395 400

Glu Trp Tyr Asn Asn Arg Asn Gln Asp Asn Tyr Val Ser Trp Ser Asp
405 410 415

Ser Glu Asp Asp Asp Glu Asp Glu Glu Ile Glu Glu Lys Glu Lys Pro
420 425 430

Glu Thr Asn Phe Pro Ser Pro Phe Thr Asn Ile Leu Cys Gly Ile Ile
435 440 445

Phe Val Glu Arg Arg Tyr Thr Ala Val Val Leu Asn Arg Leu Ile Lys
450 455 460

Glu Ala Gly Lys Gln Asp Pro Glu Leu Ala Tyr Ile Ser Ser Asn Phe
465 470 475 480

Ile Thr Gly His Gly Ile Gly Lys Asn Gln Pro Arg Asn Asn Thr Met
485 490 495

Glu Ala Glu Phe Arg Lys Gln Glu Glu Val Leu Arg Lys Phe Arg Ala
500 505 510

His Glu Thr Asn Leu Leu Ile Ala Thr Ser Ile Val Glu Glu Gly Val
515 520 525

Asp Ile Pro Lys Cys Asn Leu Val Val Arg Phe Asp Leu Pro Thr Glu
530 535 540

Tyr Arg Ser Tyr Val Gln Ser Lys Gly Arg Ala Arg Ala Pro Ile Ser
545 550 555 560

Asn Tyr Ile Met Leu Ala Asp Thr Asp Lys Ile Lys Ser Phe Glu Glu
565 570 575

Asp Leu Lys Thr Tyr Lys Ala Ile Glu Lys Ile Leu Arg Asn Lys Cys
580 585 590

Ser Lys Ser Val Asp Thr Gly Glu Thr Asp Ile Asp Pro Val Met Asp
595 600 605

Asp Asp His Val Phe Pro Pro Tyr Val Leu Arg Pro Asp Asp Gly Gly
610 615 620

Pro Arg Val Thr Ile Asn Thr Ala Ile Gly His Ile Asn Arg Tyr Cys
625 630 635 640

Ala Arg Leu Pro Ser Asp Pro Phe Thr His Leu Ala Pro Lys Cys Arg
645 650 655

Thr Arg Glu Leu Pro Asp Gly Thr Phe Tyr Ser Thr Leu Tyr Leu Pro
660 665 670

Ile Asn Ser Pro Leu Arg Ala Ser Ile Val Gly Pro Pro Met Ser Cys
675 680 685

Val Arg Leu Ala Glu Arg Val Val Ala Leu Ile Cys Cys Glu Lys Leu
690 695 700

His Lys Ile Gly Glu Leu Asp Asp His Leu Met Pro Val Gly Lys Glu
705 710 715 720

Thr Val Lys Tyr Glu Glu Glu Leu Asp Leu His Asp Glu Glu Glu Thr
725 730 735

Ser Val Pro Gly Arg Pro Gly Ser Thr Lys Arg Arg Gln Cys Tyr Pro
740 745 750

Lys Ala Ile Pro Glu Cys Leu Arg Asp Ser Tyr Pro Arg Pro Asp Gln
755 760 765

Pro Cys Tyr Leu Tyr Val Ile Gly Met Val Leu Thr Thr Pro Leu Pro
770 775 780

Asp Glu Leu Asn Phe Arg Arg Arg Lys Leu Tyr Pro Pro Glu Asp Thr
785 790 795 800

Thr Arg Cys Phe Gly Ile Leu Thr Ala Lys Pro Ile Pro Gln Ile Pro
805 810 815



His Phe Pro Val Tyr Thr Arg Ser Gly Glu Val Thr Ile Ser Ile Glu
820 825 830

Leu Lys Lys Ser Gly Phe Met Leu Ser Leu Gln Met Leu Glu Leu Ile
835 840 845

Thr Arg Leu His Gln Tyr Ile Phe Ser His Ile Leu Arg Leu Glu Lys
850 855 860

Pro Ala Leu Glu Phe Lys Pro Thr Asp Ala Asp Ser Ala Tyr Cys Val
865 870 875 880

Leu Pro Leu Asn Val Val Asn Asp Ser Ser Thr Leu Asp Ile Asp Phe
885 890 895

Lys Phe Met Glu Asp Ile Glu Lys Ser Glu Ala Arg Ile Gly Ile Pro
900 905 910

Ser Thr Lys Tyr Thr Lys Glu Thr Pro Phe Val Phe Lys Leu Glu Asp
915 920 925

Tyr Gln Asp Ala Val Ile Ile Pro Arg Tyr Arg Asn Phe Asp Gln Pro
930 935 940

His Arg Phe Tyr Val Ala Asp Val Tyr Thr Asp Leu Thr Pro Leu Ser
945 950 955 960

Lys Phe Pro Ser Pro Glu Tyr Glu Thr Phe Ala Glu Tyr Tyr Lys Thr
965 970 975

Lys Tyr Asn Leu Asp Leu Thr Asn Leu Asn Gln Pro Leu Leu Asp Val
980 985 990

Asp His Thr Ser Ser Arg Leu Asn Leu Leu Thr Pro Arg His Leu Asn
995 1000 1005

Gln Lys Gly Lys Ala Leu Pro Leu Ser Ser Ala Glu Lys Arg Lys
1010 1015 1020

Ala Lys Trp Glu Ser Leu Gln Asn Lys Gln Ile Leu Val Pro Glu
1025 1030 1035

Leu Cys Ala Ile His Pro Ile Pro Ala Ser Leu Trp Arg Lys Ala
1040 1045 1050

Val Cys Leu Pro Ser Ile Leu Tyr Arg Leu His Cys Leu Leu Thr
1055 1060 1065

Ala Glu Glu Leu Arg Ala Gln Thr Ala Ser Asp Ala Gly Val Gly
1070 1075 1080

Val Arg Ser Leu Pro Ala Asp Phe Arg Tyr Pro Asn Leu Asp Phe
1085 1090 1095

Gly Trp Lys Lys Ser Ile Asp Ser Lys Ser Phe Ile Ser Ile Ser
1100 1105 1110

Asn Ser Ser Ser Ala Glu Asn Asp Asn Tyr Cys Lys His Ser Thr
1115 1120 1125

Ile Val Pro Glu Asn Ala Ala His Gln Gly Ala Asn Arg Thr Ser
1130 1135 1140

Ser Leu Glu Asn His Asp Gln Met Ser Val Asn Cys Arg Thr Leu
1145 1150 1155

Leu Ser Glu Ser Pro Gly Lys Leu His Val Glu Val Ser Ala Asp
1160 1165 1170

Leu Thr Ala Ile Asn Gly Leu Ser Tyr Asn Gln Asn Leu Ala Asn
1175 1180 1185

Gly Ser Tyr Asp Leu Ala Asn Arg Asp Phe Cys Gln Gly Asn Gln
1190 1195 1200

Leu Asn Tyr Tyr Lys Gln Glu Ile Pro Val Gln Pro Thr Thr Ser
1205 1210 1215

Tyr Ser Ile Gln Asn Leu Tyr Ser Tyr Glu Asn Gln Pro Gln Pro
1220 1225 1230

Ser Asp Glu Cys Thr Leu Leu Ser Asn Lys Tyr Leu Asp Gly Asn
1235 1240 1245

Ala Asn Lys Ser Thr Ser Asp Gly Ser Pro Val Met Ala Val Met
1250 1255 1260

Pro Gly Thr Thr Asp Thr Ile Gln Val Leu Lys Gly Arg Met Asp
1265 1270 1275

Ser Glu Gln Ser Pro Ser Ile Gly Tyr Ser Ser Arg Thr Leu Gly
1280 1285 1290

Pro Asn Pro Gly Leu Ile Leu Gln Ala Leu Thr Leu Ser Asn Ala
1295 1300 1305

Ser Asp Gly Phe Asn Leu Glu Arg Leu Glu Met Leu Gly Asp Ser
1310 1315 1320

Phe Leu Lys His Ala Ile Thr Thr Tyr Leu Phe Cys Thr Tyr Pro
1325 1330 1335

Asp Ala His Glu Gly Arg Leu Ser Tyr Met Arg Ser Lys Lys Val
1340 1345 1350

Ser Asn Cys Asn Leu Tyr Arg Leu Gly Lys Lys Lys Gly Leu Pro
1355 1360 1365

Ser Arg Met Val Val Ser Ile Phe Asp Pro Pro Val Asn Trp Leu
1370 1375 1380



Pro Pro Gly Tyr Val Val Asn Gln Asp Lys Ser Asn Thr Asp Lys
1385 1390 1395

Trp Glu Lys Asp Glu Met Thr Lys Asp Cys Met Leu Ala Asn Gly
1400 1405 1410

Lys Leu Asp Glu Asp Tyr Glu Glu Glu Asp Glu Glu Glu Glu Ser
1415 1420 1425

Leu Met Trp Arg Ala Pro Lys Glu Glu Ala Asp Tyr Glu Asp Asp
1430 1435 1440

Phe Leu Glu Tyr Asp Gln Glu His Ile Arg Phe Ile Asp Asn Met
1445 1450 1455

Leu Met Gly Ser Gly Ala Phe Val Lys Lys Ile Ser Leu Ser Pro
1460 1465 1470

Phe Ser Thr Thr Asp Ser Ala Tyr Glu Trp Lys Met Pro Lys Lys
1475 1480 1485

Ser Ser Leu Gly Ser Met Pro Phe Ser Ser Asp Phe Glu Asp Phe
1490 1495 1500

Asp Tyr Ser Ser Trp Asp Ala Met Cys Tyr Leu Asp Pro Ser Lys
1505 1510 1515

Ala Val Glu Glu Asp Asp Phe Val Val Gly Phe Trp Asn Pro Ser
1520 1525 1530

Glu Glu Asn Cys Gly Val Asp Thr Gly Lys Gln Ser Ile Ser Tyr
1535 1540 1545

Asp Leu His Thr Glu Gln Cys Ile Ala Asp Lys Ser Ile Ala Asp
1550 1555 1560

Cys Val Glu Ala Leu Leu Gly Cys Tyr Leu Thr Ser Cys Gly Glu
1565 1570 1575

Arg Ala Ala Gln Leu Phe Leu Cys Ser Leu Gly Leu Lys Val Leu
1580 1585 1590

Pro Val Ile Lys Arg Thr Asp Arg Glu Lys Ala Leu Cys Pro Thr
1595 1600 1605

Arg Glu Asn Phe Asn Ser Gln Gln Lys Asn Leu Ser Val Ser Cys
1610 1615 1620

Ala Ala Ala Ser Val Ala Ser Ser Arg Ser Ser Val Leu Lys Asp
1625 1630 1635

Ser Glu Tyr Gly Cys Leu Lys Ile Pro Pro Arg Cys Met Phe Asp
1640 1645 1650

His Pro Asp Ala Asp Lys Thr Leu Asn His Leu Ile Ser Gly Phe
1655 1660 1665

Glu Asn Phe Glu Lys Lys Ile Asn Tyr Arg Phe Lys Asn Lys Ala
1670 1675 1680

Tyr Leu Leu Gln Ala Phe Thr His Ala Ser Tyr His Tyr Asn Thr
1685 1690 1695

Ile Thr Asp Cys Tyr Gln Arg Leu Glu Phe Leu Gly Asp Ala Ile
1700 1705 1710

Leu Asp Tyr Leu Ile Thr Lys His Leu Tyr Glu Asp Pro Arg Gln
1715 1720 1725

His Ser Pro Gly Val Leu Thr Asp Leu Arg Ser Ala Leu Val Asn
1730 1735 1740

Asn Thr Ile Phe Ala Ser Leu Ala Val Lys Tyr Asp Tyr His Lys
1745 1750 1755

Tyr Phe Lys Ala Val Ser Pro Glu Leu Phe His Val Ile Asp Asp
1760 1765 1770

Phe Val Gln Phe Gln Leu Glu Lys Asn Glu Met Gln Gly Met Asp
1775 1780 1785

Ser Glu Leu Arg Arg Ser Glu Glu Asp Glu Glu Lys Glu Glu Asp
1790 1795 1800

Ile Glu Val Pro Lys Ala Met Gly Asp Ile Phe Glu Ser Leu Ala
1805 1810 1815

Gly Ala Ile Tyr Met Asp Ser Gly Met Ser Leu Glu Thr Val Trp
1820 1825 1830

Gln Val Tyr Tyr Pro Met Met Arg Pro Leu Ile Glu Lys Phe Ser
1835 1840 1845

Ala Asn Val Pro Arg Ser Pro Val Arg Glu Leu Leu Glu Met Glu
1850 1855 1860

Pro Glu Thr Ala Lys Phe Ser Pro Ala Glu Arg Thr Tyr Asp Gly
1865 1870 1875

Lys Val Arg Val Thr Val Glu Val Val Gly Lys Gly Lys Phe Lys
1880 1885 1890

Gly Val Gly Arg Ser Tyr Arg Ile Ala Lys Ser Ala Ala Ala Arg
1895 1900 1905

Arg Ala Leu Arg Ser Leu Lys Ala Asn Gln Pro Gln Val Pro Asn
1910 1915 1920



Ser

<210> 3
<211> 6750
<212> DNA
<213> Drosophila melanogaster

<220> 
<221> CDS 
<222> (1)..(6750) 

<400> 3 

atg gcg ttc cac tgg tgc gac aac aat ctg cac acc acc gtg ttc acg 48
Met Ala Phe His Trp Cys Asp Asn Asn Leu His Thr Thr Val Phe Thr
1 5 10 15

ccQ cgc gac ttt cag gtg gag cta ctg gcc acc gcc tac gag cgg aac 96
Pro Arg Asp Phe Gln Val Glu Leu Leu Ala Thr Ala Tyr Glu Arg Asn
20 25 30

acg att att tgc ctg ggc cat cga agt tcc aag gag ttt ata gcc ctc 144
Thr Ile Ile Cys Leu Gly His Arg Ser Ser Lys Glu Phe Ile Ala Leu
35 40 45

aag ctg ctc cag gag ctg tcg cgt cga gca cgc cga cat ggt cgt gtc 192
Lys Leu Leu Gln Glu Leu Ser Arg Arg Ala Arg Arg His Gly Arg Val
50 55 60

agt gtc tat ctc agt tgc gag gtt ggc acc agc acg gaa cca tgc tcc 240
Ser Val Tyr Leu Ser Cys Glu Val Gly Thr Ser Thr Glu Pro Cys Ser
65 70 75 80

atc tac acg atg ctc acc cac ttg act gac ctg cgg gtg tgg cag gag 288
Ile Tyr Thr Met Leu Thr His Leu Thr Asp Leu Arg Val Trp Gln Glu
85 90 95

cag ccg gat atg caa att ccc ttt gat cat tgc tgg acg gac tat cac 336
Gln Pro Asp Met Gln Ile Pro Phe Asp His Cys Trp Thr Asp Tyr His
100 105 110

gtt tcc atc cta cgg cca gag gga ttt ctt tat ctg ctc gaa act cgc 384
Val Ser Ile Leu Arg Pro Glu Gly Phe Leu Tyr Leu Leu Glu Thr Arg
115 120 125

gag ctg ctg ctg agc agc gtc gaa ctg atc gtg ctg gaa gat tgt cat 432
Glu Leu Leu Leu Ser Ser Val Glu Leu Ile Val Leu Glu Asp Cys His
130 135 140

gac agc gcc gtt tat cag agg ata agg cct ctg ttc gag aat cac att 480
Asp Ser Ala Val Tyr Gln Arg Ile Arg Pro Leu Phe Glu Asn His Ile
145 150 155 160

atg cca gcg cca ccg gcg gac agg cca cgg att ctc gga ctc gct gga 528
Met Pro Ala Pro Pro Ala Asp Arg Pro Arg Ile Leu Gly Leu Ala Gly
165 170 175

ccg ctg cac agc gcc gga tgt gag ctg cag caa ctg agc gcc atg ctg 576
Pro Leu His Ser Ala Gly Cys Glu Leu Gln Gln Leu Ser Ala Met Leu
180 185 190

gcc acc ctg gag cag agt gtg ctt tgc cag atc gag acg gcc agt gat 624
Ala Thr Leu Glu Gln Ser Val Leu Cys Gln Ile Glu Thr Ala Ser Asp
195 200 205

att gtc acc gtg ttg cgt tac tgt tcc cga ccg cac gaa tac atc gta 672
Ile Val Thr Val Leu Arg Tyr Cys Ser Arg Pro His Glu Tyr Ile Val
210 215 220

cag tgc gcc ccc ttc gag atg gac gaa ctg tcc ctg gtg ctt gcc gat 720
Gln Cys Ala Pro Phe Glu Met Asp Glu Leu Ser Leu Val Leu Ala Asp
225 230 235 240

gtg ctc aac aca cac aag tcc ttt tta ttg gac cac cgc tac gat ccc 768
Val Leu Asn Thr His Lys Ser Phe Leu Leu Asp His Arg Tyr Asp Pro
245 250 255

tac gaa atc tac ggc aca gac cag ttt atg gac gaa ctg aaa gac ata 816
Tyr Glu Ile Tyr Gly Thr Asp Gln Phe Met Asp Glu Leu Lys Asp Ile
260 265 270

ccc gat ccc aag gtg gac ccc ctg aac gtc atc aac tca cta ctg gtc 864
Pro Asp Pro Lys Val Asp Pro Leu Asn Val Ile Asn Ser Leu Leu Val
275 280 285

gtg ctg cac gag atg ggt cct tgg tgc acg cag cgg gct gca cat cac 912
Val Leu His Glu Met Gly Pro Trp Cys Thr Gln Arg Ala Ala His His
290 295 300

ttt tac caa tgc aat gag aag tta aag gtg aag acg ccg cac gaa cgt 960
Phe Tyr Gln Cys Asn Glu Lys Leu Lys Val Lys Thr Pro His Glu Arg
305 310 315 320

cac tac ttg ctg tac tgc cta gtg agc acg gcc ctt atc caa ctg tac 1008
His Tyr Leu Leu Tyr Cys Leu Val Ser Thr Ala Leu Ile Gln Leu Tyr
325 330 335

tcc ctc tgc gaa cac gca ttc cat cga cat tta gga agt ggc agc gat 1056
Ser Leu Cys Glu His Ala Phe His Arg His Leu Gly Ser Gly Ser Asp
340 345 350

tca cgc cag acc atc gaa cgc \3X tcc agc ccc aag gtg cga cgt ctg 1104
Ser Arg Gln Thr Ile Glu Arg Tyr Ser Ser Pro Lys Val Arg Arg Leu
355 360 365

ttg cag aca ctg agg tgc ttc aag ccg gaa gag gtg cac acc caa gcg 1152
Leu Gln Thr Leu Arg Cys Phe Lys Pro Glu Glu Val His Thr Gln Ala
370 375 380

gac gga ctg cgc aga atg cgg cat cag gtg gat cag gcg gac ttc aat 1200
Asp Gly Leu Arg Arg Met Arg His Gln Val Asp Gln Ala Asp Phe Asn
385 390 395 400

cgg tta tct cat acg ctg gaa agc aag tgc cga atg gtg gat caa atg 1248
Arg Leu Ser His Thr Leu Glu Ser Lys Cys Arg Met Val Asp Gln Met
405 410 415

gac caa ccg ccg acg gag aca cga gcc ctg gtg gcc act ctt gag cag 1296
Asp Gln Pro Pro Thr Glu Thr Arg Ala Leu Val Ala Thr Leu Glu Gln
420 425 430

att ctg cac acg aca gag gac agg cag acg aac aga agc gcc gct cgg 1344
Ile Leu His Thr Thr Glu Asp Arg Gln Thr Asn Arg Ser Ala Ala Arg
435 440 445

gtg act cct act cct act ccc gct cat gcg aag ccg aaa cct agc tct 1392
Val Thr Pro Thr Pro Thr Pro Ala His Ala Lys Pro Lys Pro Ser Ser
450 455 460

ggt gcc aac act gca caa cca cga act cgt aga cgt gtg tac acc agg 1440
Gly Ala Asn Thr Ala Gln Pro Arg Thr Arg Arg Arg Val Tyr Thr Arg
465 470 475 480

cgc cac cac cgg gat cac aat gat ggc agc gac acg ctc tgc gca ctg 1488
Arg His His Arg Asp His Asn Asp Gly Ser Asp Thr Leu Cys Ala Leu
485 490 495

att tac tgc aac cag aac cac acg gct cgc gtg ctc ttt gag ctt cta 1536
Ile Tyr Cys Asn Gln Asn His Thr Ala Arg Val Leu Phe Glu Leu Leu
500 505 510

gcg gag att agc aga cgt gat ccc gat ctc aag ttc cta cgc tgc cag 1584
Ala Glu Ile Ser Arg Arg Asp Pro Asp Leu Lys Phe Leu Arg Cys Gln
515 520 525

tac acc acg gac cgg gtg gca gat ccc acc acg gag ccc aaa gag gct 1632
Tyr Thr Thr Asp Arg Val Ala Asp Pro Thr Thr Glu Pro Lys Glu Ala
530 535 540

gag ttg gag cac cgg cgg cag gaa gag gtg cta aag cgc ttc cgc atg 1680
Glu Leu Glu His Arg Arg Gln Glu Glu Val Leu Lys Arg Phe Arg Met
545 550 555 560

cat gac tgc aat gtc ctg atc ggt act tcg gtg ctg gaa gag ggc atc 1728
His Asp Cys Asn Val Leu Ile Gly Thr Ser Val Leu Glu Glu Gly Ile
565 570 575

gat gtg ccc aag tgc aat ttg gtt gtg cgc tgg gat ccg cca acc aca 1776
Asp Val Pro Lys Cys Asn Leu Val Val Arg Trp Asp Pro Pro Thr Thr
580 585 590

tat cgc agt tac gtt cag tgc aaa ggt cga gcc cgt gct gct cca gcc 1824
Tyr Arg Ser Tyr Val Gln Cys Lys Gly Arg Ala Arg Ala Ala Pro Ala
595 600 605

tat cat gtc att ctc gtc gct ccg agt tat aaa agc cca act gtg ggg 1872
Tyr His Val Ile Leu Val Ala Pro Ser Tyr Lys Ser Pro Thr Val Gly
610 615 620

tca gtg cag ctg acc gat cgg agt cat cgg tat att tgc gcg act ggt 1920
Ser Val Gln Leu Thr Asp Arg Ser His Arg Tyr Ile Cys Ala Thr Gly
625 630 635 640

gat act aca gag gcg gac agc gac tct gat gat tca gcg atg cca aac 1968
Asp Thr Thr Glu Ala Asp Ser Asp Ser Asp Asp Ser Ala Met Pro Asn
645 650 655

tcg tcc ggc tcg gat ccc tat act ttt ggc acg gca cgc gga acc gtg 2016
Ser Ser Gly Ser Asp Pro Tyr Thr Phe Gly Thr Ala Arg Gly Thr Val
660 665 670

aag atc ctc aac ccc gaa gtg ttc agt aaa caa cca ccg aca gcg tgc 2064
Lys Ile Leu Asn Pro Glu Val Phe Ser Lys Gln Pro Pro Thr Ala Cys
675 680 685

gac att aag ctg cag gag atc cag gac gaa ttg cca gcc gca gcg cag 2112
Asp Ile Lys Leu Gln Glu Ile Gln Asp Glu Leu Pro Ala Ala Ala Gln
690 695 700

ctg gat acg agc aac tcc agc gac gaa gcc gtc agc atg agt aac acg 2160
Leu Asp Thr Ser Asn Ser Ser Asp Glu Ala Val Ser Met Ser Asn Thr
705 710 715 720

tct cca agc gag agc agt aca gaa caa aaa tcc aga cgc ttc cag tgc 2208
Ser Pro Ser Glu Ser Ser Thr Glu Gln Lys Ser Arg Arg Phe Gln Cys
725 730 735

gag ctg agc tct tta acg gag cca gaa gac aca agt gat act aca gcc 2256
Glu Leu Ser Ser Leu Thr Glu Pro Glu Asp Thr Ser Asp Thr Thr Ala
740 745 750

gaa atc gat act gct cat agt tta gcc agc acc acg aag gac ttg gtg 2304
Glu Ile Asp Thr Ala His Ser Leu Ala Ser Thr Thr Lys Asp Leu Val
755 760 765

cat caa atg gca cag tat cgc gaa atc gag cag atg ctg cta tcc aag 2352
His Gln Met Ala Gln Tyr Arg Glu Ile Glu Gln Met Leu Leu Ser Lys
770 775 780

tgc gcc aac aca gag ccg ccg gag cag gag cag agt gag gcg gaa cgt 2400
Cys Ala Asn Thr Glu Pro Pro Glu Gln Glu Gln Ser Glu Ala Glu Arg
785 790 795 800

ttt agt gcc tgc ctg gcc gca tac cga ccc aag ccg cac ctg cta aca 2448
Phe Ser Ala Cys Leu Ala Ala Tyr Arg Pro Lys Pro His Leu Leu Thr
805 810 815

ggc gcc tcc gtg gat ctg ggt tct gct ata gct ttg gtc aac aag tac 2496
Gly Ala Ser Val Asp Leu Gly Ser Ala Ile Ala Leu Val Asn Lys Tyr
820 825 830

tgc gcc cga ctg cca agc gac acg ttc acc aag ttg acg gcg ttg tgg 2544
Cys Ala Arg Leu Pro Ser Asp Thr Phe Thr Lys Leu Thr Ala Leu Trp
835 840 845

cgc tgc acc cga aac gaa agg gct gga gtg acc ctg ttt cag tac aca 2592
Arg Cys Thr Arg Asn Glu Arg Ala Gly Val Thr Leu Phe Gln Tyr Thr
850 855 860

ctc cgt ctg ccc atc aac tcg cca ttg aag cat gac att gtg ggt ctt 2640
Leu Arg Leu Pro Ile Asn Ser Pro Leu Lys His Asp Ile Val Gly Leu
865 870 875 880

ccg atg cca act caa aca ttg gcc cgc cga ctg gct gcc ttg cag gct 2688
Pro Met Pro Thr Gln Thr Leu Ala Arg Arg Leu Ala Ala Leu Gln Ala
885 890 895

tgc gtg gaa ctg cac agg atc ggt gag tta gac gat cag ttg cag cct 2736
Cys Val Glu Leu His Arg Ile Gly Glu Leu Asp Asp Gln Leu Gln Pro
900 905 910

atc ggc aag gag gga ttt cgt gcc ctg gag ccg gac tgg gag tgc ttt 2784
Ile Gly Lys Glu Gly Phe Arg Ala Leu Glu Pro Asp Trp Glu Cys Phe
915 920 925

gaa ctg gag cca gag gac gaa cag att gtg cag cta agc gat gaa cca 2832
Glu Leu Glu Pro Glu Asp Glu Gln Ile Val Gln Leu Ser Asp Glu Pro
930 935 940

cgt ccg gga aca acg aag cgt cgt cag tac tat tac aaa cgc att gca 2880
Arg Pro Gly Thr Thr Lys Arg Arg Gln Tyr Tyr Tyr Lys Arg Ile Ala
945 950 955 960

tcc gaa ttt tgc gat tgc cgt ccc gtt gcc gga gcg cca tgc tat ttg 2928
Ser Glu Phe Cys Asp Cys Arg Pro Val Ala Gly Ala Pro Cys Tyr Leu
965 970 975

tac ttt atc caa ctg acg ctc caa tgt ccg att ccc gaa gag caa aac 2976
Tyr Phe Ile Gln Leu Thr Leu Gln Cys Pro Ile Pro Glu Glu Gln Asn
980 985 990

acg cgg gga cgc aag att tat ccg ccc gaa gat gcg cag cag gga ttc 3024
Thr Arg Gly Arg Lys Ile Tyr Pro Pro Glu Asp Ala Gln Gln Gly Phe
995 1000 1005

ggc att cta acc acc aaa cgg ata ccc aag ctg agt gct ttc tcg 3069
Gly Ile Leu Thr Thr Lys Arg Ile Pro Lys Leu Ser Ala Phe Ser
1010 1015 1020

ata ttc acg cgt tcc ggt gag gtg aag gtt tcc ctg gag tta gct 3114
Ile Phe Thr Arg Ser Gly Glu Val Lys Val Ser Leu Glu Leu Ala
1025 1030 1035

aag gaa cgc gtg att cta act agc gaa caa ata gtc tgc atc aac 3159
Lys Glu Arg Val Ile Leu Thr Ser Glu Gln Ile Val Cys Ile Asn
1040 1045 1050

gga ttt tta aac tac acg ttc acc aat gta ctg cgt ttg caa aag 3204
Gly Phe Leu Asn Tyr Thr Phe Thr Asn Val Leu Arg Leu Gln Lys
1055 1060 1065

ttt ctg atg ctc ttc gat ccg gac tcc acg gaa aat tgt gta ttc 3249
Phe Leu Met Leu Phe Asp Pro Asp Ser Thr Glu Asn Cys Val Phe
1070 1075 1080

att gtg ccc acc gtg aag gca cca gct ggc ggc aag cac atc gac 3294
Ile Val Pro Thr Val Lys Ala Pro Ala Gly Gly Lys His Ile Asp
1085 1090 1095

tgg cag ttt ctg gag ctg atc caa gcg aat gga aat aca atg cca 3339
Trp Gln Phe Leu Glu Leu Ile Gln Ala Asn Gly Asn Thr Met Pro
1100 1105 1110

cgg gca gtg ccc gat gag gag cgc cag gcg cag ccg ttt gat ccg 3384
Arg Ala Val Pro Asp Glu Glu Arg Gln Ala Gln Pro Phe Asp Pro
1115 1120 1125

caa cgc ttc cag gat gcc gtc gtt atg ccg tgg tat cgc aac cag 3429
Gln Arg Phe Gln Asp Ala Val Val Met Pro Trp Tyr Arg Asn Gln
1130 1135 1140

gat caa ccg cag tat ttc tat gtg gcg gag ata tgt cca cat cta 3474
Asp Gln Pro Gln Tyr Phe Tyr Val Ala Glu Ile Cys Pro His Leu
1145 1150 1155

tcc cca ctc agc tgc ttc cct ggt gac aac tac cgc acg ttc aag 3519
Ser Pro Leu Ser Cys Phe Pro Gly Asp Asn Tyr Arg Thr Phe Lys
1160 1165 1170

cac tac tac ctc gtc aag tat ggt ctg acc ata cag aat acc tcg 3564
His Tyr Tyr Leu Val Lys Tyr Gly Leu Thr Ile Gln Asn Thr Ser
1175 1180 1185

cag ccg cta ttg gac gtg gat cac acc agt gcg cgg tta aac ttc 3609
Gln Pro Leu Leu Asp Val Asp His Thr Ser Ala Arg Leu Asn Phe
1190 1195 1200

ctc acg cca cga tac gtt aat cgc aag ggc gtt gct ctg ccc act 3654
Leu Thr Pro Arg Tyr Val Asn Arg Lys Gly Val Ala Leu Pro Thr
1205 1210 1215

agt tcg gag gag aca aag cgg gca aag cgc gag aat ctc gaa cag 3699
Ser Ser Glu Glu Thr Lys Arg Ala Lys Arg Glu Asn Leu Glu Gln
1220 1225 1230

aag cag atc ctt gtg cca gag ctc tgc act gtg cat cca ttc ccc 3744
Lys Gln Ile Leu Val Pro Glu Leu Cys Thr Val His Pro Phe Pro
1235 1240 1245

gcc tcc ttg tgg cga act gcc gtg tgc ctg ccc tgc atc ctg tac 3789
Ala Ser Leu Trp Arg Thr Ala Val Cys Leu Pro Cys Ile Leu Tyr
1250 1255 1260

cgc ata aat ggt ctt cta ttg gcc gac gat att cgg aaa cag gtt 3834
Arg Ile Asn Gly Leu Leu Leu Ala Asp Asp Ile Arg Lys Gln Val
1265 1270 1275

tct gcg gat ctg ggg ctg gga agg caa cag atc gaa gat gag gat 3879
Ser Ala Asp Leu Gly Leu Gly Arg Gln Gln Ile Glu Asp Glu Asp
1280 1285 1290

ttc gag tgg ccc atg ctg gac ttt ggg tgg agt cta tcg gag gtg 3924
Phe Glu Trp Pro Met Leu Asp Phe Gly Trp Ser Leu Ser Glu Val
1295 1300 1305

ctc aag aaa tcg cgg gag tcc aaa caa aag gag tcc ctt aag gat 3969
Leu Lys Lys Ser Arg Glu Ser Lys Gln Lys Glu Ser Leu Lys Asp
1310 1315 1320

gat act att aat ggc aaa gac tta gct gat gtt gaa aag aaa ccg 4014
Asp Thr Ile Asn Gly Lys Asp Leu Ala Asp Val Glu Lys Lys Pro
1325 1330 1335

act agc gag gag acc caa cta gat aag gat tca aaa gac gat aag 4059
Thr Ser Glu Glu Thr Gln Leu Asp Lys Asp Ser Lys Asp Asp Lys
1340 1345 1350

gtt gag aaa agt gct att gaa cta atc att gag gga gag gag aag 4104
Val Glu Lys Ser Ala Ile Glu Leu Ile Ile Glu Gly Glu Glu Lys
1355 1360 1365

ctg caa gag gct gat gac ttc att gag ata ggc act tgg tca aac 4149
Leu Gln Glu Ala Asp Asp Phe Ile Glu Ile Gly Thr Trp Ser Asn
1370 1375 1380

gat atg gcc gac gat ata gct agt ttt aac caa gaa gac gac gac 4194
Asp Met Ala Asp Asp Ile Ala Ser Phe Asn Gln Glu Asp Asp Asp
1385 1390 1395

gag gat gac gcc ttc cat ctc cca gtt tta ccg gca aac gtt aag 4239
Glu Asp Asp Ala Phe His Leu Pro Val Leu Pro Ala Asn Val Lys
1400 1405 1410

ttc tgt gat cag caa acg cgc tac ggt tcg ccc aca ttt tgg gat 4284
Phe Cys Asp Gln Gln Thr Arg Tyr Gly Ser Pro Thr Phe Trp Asp
1415 1420 1425

gtg agc aat ggc gaa agc ggc ttc aag ggt cca aag agc agt cag 4329
Val Ser Asn Gly Glu Ser Gly Phe Lys Gly Pro Lys Ser Ser Gln
1430 1435 1440

aat aag cag ggt ggc aag ggc aaa gca aag ggt ccg gca aag ccc 4374
Asn Lys Gln Gly Gly Lys Gly Lys Ala Lys Gly Pro Ala Lys Pro
1445 1450 1455

aca ttt aac tat tat gac tcg gac aat tcg ctg ggt tcc agc tac 4419
Thr Phe Asn Tyr Tyr Asp Ser Asp Asn Ser Leu Gly Ser Ser Tyr
1460 1465 1470

gat gac gac gat aac gca ggt ccg ctc aat tac atg cat cac aac 4464
Asp Asp Asp Asp Asn Ala Gly Pro Leu Asn Tyr Met His His Asn
1475 1480 1485

tac agt tcg gat gac gac gat gtg gca gat gat atc gat gcg gga 4509
Tyr Ser Ser Asp Asp Asp Asp Val Ala Asp Asp Ile Asp Ala Gly
1490 1495 1500

cgc att gcg ttc acc tcc aag aat gaa gcg gag act att gaa acc 4554
Arg Ile Ala Phe Thr Ser Lys Asn Glu Ala Glu Thr Ile Glu Thr
1505 1510 1515

gca cag gaa gtg gaa aag cgc cag aag cag ctg tcc atc atc cag 4599
Ala Gln Glu Val Glu Lys Arg Gln Lys Gln Leu Ser Ile Ile Gln
1520 1525 1530

gcg acc aat gct aac gag cgg cag tat cag cag aca aag aac ctg 4644
Ala Thr Asn Ala Asn Glu Arg Gln Tyr Gln Gln Thr Lys Asn Leu
1535 1540 1545

ctcatt gga ttc aat ttt aag cat gag gac cag aag gaacctgcc 4689 
Leu He Gly Phe Asn Phe Lys His Glu Asp Gin Lys Glu Pro Ala 
1550 1655 1560 

act ata aga tat gaa gaa tcc ata gct aag ctc aaa acg gaa ata 4734 
Thr Ile Arg Tyr Glu Glu Ser Ile Ala Lys Leu Lys Thr Glu Ile 
1565 1570 1575 

gaa tcc ggc ggc atg ttg gtg ccg cac gac cag cag ttg gtt cta 4779 
Glu Ser Gly Gly Met Leu Val Pro His Asp Gln Gln Leu Val Leu 
1580 1585 1590 

aaa aga agt gat gcc gct gag gct cag gtt gca aag gta tcg atg 4824 
Lys Arg Ser Asp Ala Ala Glu Ala Gln Val Ala Lys Val Ser Met 
1595 1600 1605 

atg gag cta ttg aag cag ctg ctg ccg tat gta aat gaa gat gtg 4869 
Met Glu Leu Leu Lys Gln Leu Leu Pro Tyr Val Asn Glu Asp Val 
1610 1615 1620 

ctg gcc aaa aag ctg ggt gat agg cgc gag ctt ctg ctg tcg gat 4914 
Leu Ala Lys Lys Leu Gly Asp Arg Arg Glu Leu Leu Leu Ser Asp 
1625 1630 1635 

ttg gta gag cta aat gca gat tgg gta gcg cga cat gag cag gag 4959 
Leu Val Glu Leu Asn Ala Asp Trp Val Ala Arg His Glu Gln Glu 
1640 1645 1650 

acc tac aat gta atg gga tgc gga gat agt ttt gac aac tat aac 5004 
Thr Tyr Asn Val Met Gly Cys Gly Asp Ser Phe Asp Asn Tyr Asn 
1655 1660 1665 

gat cat cat cgg ctg aac ttg gat gaa aag caa ctg aaa ctg caa 5049 
Asp His His Arg Leu Asn Leu Asp Glu Lys Gln Leu Lys Leu Gln 
1670 1675 1680 
tac gaa cga att gaa att gag cca cct act tcc acg aag gcc ata 5094 
Tyr Glu Arg Ile Glu Ile Glu Pro Pro Thr Ser Thr Lys Ala Ile 
1685 1690 1695 

acc tca gcc ata tta cca gct ggc ttc agt ttc gat cga caa ccg 5139 
Thr Ser Ala Ile Leu Pro Ala Gly Phe Ser Phe Asp Arg Gln Pro 
1700 1705 1710 

gat cta gtg ggc cat cca gga ccc agt ccc agc atc att ttg caa 5184 
Asp Leu Val Gly His Pro Gly Pro Ser Pro Ser Ile Ile Leu Gln 
1715 1720 1725 

gcc ctc aca atg tcc aat gct aac gat ggc atc aat ctg gag cga 5229 
Ala Leu Thr Met Ser Asn Ala Asn Asp Gly Ile Asn Leu Glu Arg 
1730 1735 1740 

ctg gag aca att gga gat tcc ttt cta aag tat gcc att acc acc 5274 
Leu Glu Thr Ile Gly Asp Ser Phe Leu Lys Tyr Ala Ile Thr Thr 
1745 1750 1755 

tac ttg tac atc acc tac gag aat gtg cac gag gga aaa cta agt 5319 
Tyr Leu Tyr Ile Thr Tyr Glu Asn Val His Glu Gly Lys Leu Ser 
1760 1765 1770 

cac ctg cgc tcc aag cag gtt gcc aat ctc aat ctc tat cgt ctg 5364 
His Leu Arg Ser Lys Gln Val Ala Asn Leu Asn Leu Tyr Arg Leu 
1775 1780 1785 

ggc aga cgt aag aga ctg ggt gaa tat atg ata gcc act aaa ttc 5409 
Gly Arg Arg Lys Arg Leu Gly Glu Tyr Met Ile Ala Thr Lys Phe 
1790 1795 1800 

gag ccg cac gac aat tgg ctg cca ccc tgc tac tac gtg cca aag 5454 
Glu Pro His Asp Asn Trp Leu Pro Pro Cys Tyr Tyr Val Pro Lys 
1805 1810 1815 
gag cta gag aag gcg ctc atc gag gcg aag atc ccc act cac cat 5499 
Glu Leu Glu Lys Ala Leu Ile Glu Ala Lys Ile Pro Thr His His 
1820 1825 1830 

tgg aag ctg gcc gat ctg cta gac att aag aac cta agc agt gtg 5544 
Trp Lys Leu Ala Asp Leu Leu Asp Ile Lys Asn Leu Ser Ser Val 
1835 1840 1845 

caa atc tgc gag atg gtt cgc gaa aaa gcc gat gcc ctg ggc ttg 5589 
Gln Ile Cys Glu Met Val Arg Glu Lys Ala Asp Ala Leu Gly Leu 
1850 1855 1860 

gag cag aat ggg ggt gcc caa aat gga caa ctt gac gac tcc aat 5634 
Glu Gln Asn Gly Gly Ala Gln Asn Gly Gln Leu Asp Asp Ser Asn 
1865 1870 1875 

gat agc tgc aat gat ttt agc tgt ttt att ccc tac aac ctt gtt 5679 
Asp Ser Cys Asn Asp Phe Ser Cys Phe Ile Pro Tyr Asn Leu Val 
1880 1885 1890 

tcg caa cac agc att ccg gat aag tct att gcc gat tgc gtc gaa 5724 
Ser Gln His Ser Ile Pro Asp Lys Ser Ile Ala Asp Cys Val Glu 
1895 1900 1905 

gcc ctc att gga gcc tat ctc att gag tgc gga ccc cga ggg gct 5769 
Ala Leu Ile Gly Ala Tyr Leu Ile Glu Cys Gly Pro Arg Gly Ala 
1910 1915 1920 

tta ctc ttt atg gcc tgg ctg ggc gtg aga gtg ctc cct atc aca 5814 
Leu Leu Phe Met Ala Trp Leu Gly Val Arg Val Leu Pro Ile Thr 
1925 1930 1935 

agg cag ttg gac ggg ggt aac cag gag caa cga ata ccc ggt agc 5859 
Arg Gln Leu Asp Gly Gly Asn Gln Glu Gln Arg Ile Pro Gly Ser 
1940 1945 1950 

aca aaa ccg aat gcc gaa aat gtg gtc acc gtt tac ggt gca tgg 5904 
Thr Lys Pro Asn Ala Glu Asn Val Val Thr Val Tyr Gly Ala Trp 
1955 1960 1965 

ccc acg ccg cgt agt cca ctg ctg cac ttt gct cca aat gct acg 5949 
Pro Thr Pro Arg Ser Pro Leu Leu His Phe Ala Pro Asn Ala Thr 
1970 1975 1980 

gag gag ctg gac cag tta cta agc ggc ttt gag gag ttt gag gag 5994 
Glu Glu Leu Asp Gln Leu Leu Ser Gly Phe Glu Glu Phe Glu Glu 
1985 1990 1995 

agt ttg gga tac aag ttc cgg gat cgg tcg tac e^ ttg caa gcc 6039 
Ser Leu Gly Tyr Lys Phe Arg Asp Arg Ser Tyr Leu Leu Gln Ala 
2000 2005 2010 

atg aca cat gcc agt tac acg ccc aat cga ttg acg gat tgc tat 6084 
Met Thr His Ala Ser Tyr Thr Pro Asn Arg Leu Thr Asp Cys Tyr 
2015 2020 2025 

cag cgt ctg gag ttc ctg ggc gat gct gtt cta gat tac ctc att 6129 
Gln Arg Leu Glu Phe Leu Gly Asp Ala Val Leu Asp Tyr Leu Ile 
2030 2035 2040 

acg cgg cat tta tac gaa gat ccc cgc cag cat tct cca ggc gca 6174 
Thr Arg His Leu Tyr Glu Asp Pro Arg Gln His Ser Pro Gly Ala 
2045 2050 2055 

tta acg gat ttg cgg tca gca ctg gtg aat aat aca ata ttc gcc 6219 
Leu Thr Asp Leu Arg Ser Ala Leu Val Asn Asn Thr Ile Phe Ala 
2060 2065 2070 

tcc ctg gct gtt cgc cat ggc ttc cac aag ttc ttc cgg cac ctc 6264 
Ser Leu Ala Val Arg His Gly Phe His Lys Phe Phe Arg His Leu 
2075 2080 2085 

tcg ccg ggc ctt aac gat gtg att gac cgt ttt gtg cgg atc cag 6309 
Ser Pro Gly Leu Asn Asp Val Ile Asp Arg Phe Val Arg Ile Gln 
2090 2095 2100 

cag gag aat gga cac tgc atc agt gag gag tac tac tta ttg tcc 6354 
Gln Glu Asn Gly His Cys Ile Ser Glu Glu Tyr Tyr Leu Leu Ser 
2105 2110 2115 

gag gag gag tgc gat gac gcc gag gac gtt gag gtg ccc aag gca 6399 
Glu Glu Glu Cys Asp Asp Ala Glu Asp Val Glu Val Pro Lys Ala 
2120 2125 2130 

ttg ggc gac gtt ttc gag tcg atc gca ggt gcc att ttt ctc gac 6444 
Leu Gly Asp Val Phe Glu Ser Ile Ala Gly Ala Ile Phe Leu Asp 
2135 2140 2145 

tca aac atg tcg ctg gac gtg gtt tgg cac gta tat agc aac atg 6489 
Ser Asn Met Ser Leu Asp Val Val Trp His Val Tyr Ser Asn Met 
2150 2155 2160 

atg agc ccg gag atc gag cag ttc agc aac tca gtg cca aaa tcg 6534 
Met Ser Pro Glu Ile Glu Gln Phe Ser Asn Ser Val Pro Lys Ser 
2165 2170 2175 

ccc att cgg gag ctc ctc gag ctg gag ccg gaa acc gcc aag ttc 6579 
Pro Ile Arg Glu Leu Leu Glu Leu Glu Pro Glu Thr Ala Lys Phe 
2180 2185 2190 

ggc aag ccc gag aag ctg gcg gat ggg cga cgg gtg cgc gtt acc 6624 
Gly Lys Pro Glu Lys Leu Ala Asp Gly Arg Arg Val Arg Val Thr 
2195 2200 2205 

gtg gat gtc ttc tgc aaa gga acc ttc cgt ggc atc gga cgc aac 6669 
Val Asp Val Phe Cys Lys Gly Thr Phe Arg Gly Ile Gly Arg Asn 
2210 2215 2220 

tat cgc att gcc aag tgc acg gcg gcc aaa tgc gca ttg cgc caa 6714 
Tyr Arg Ile Ala Lys Cys Thr Ala Ala Lys Cys Ala Leu Arg Gln 
2225 2230 2235 
ctc aaa aag cag ggc ttg ata gcc aaa aaa gac taa 6750 
Leu Lys Lys Gln Gly Leu Ile Ala Lys Lys Asp 
2240 2245 



<210> 4 
<211> 2249 
<212> PRT 

<213> Drosophila melanogaster 
<400> 4 

Met Ala Phe His Trp Cys Asp Asn Asn Leu His Thr Thr Val Phe Thr 
1 5 10 15 



Pro Arg Asp Phe Gln Val Glu Leu Leu Ala Thr Ala Tyr Glu Arg Asn 
20 25 30 



Thr Ile Ile Cys Leu Gly His Arg Ser Ser Lys Glu Phe Ile Ala Leu 
35 40 45 



Lys Leu Leu Gln Glu Leu Ser Arg Arg Ala Arg Arg His Gly Arg Val 
50 55 60 



Ser Val Tyr Leu Ser Cys Glu Val Gly Thr Ser Thr Glu Pro Cys Ser 
65 70 75 80 



Ile Tyr Thr Met Leu Thr His Leu Thr Asp Leu Arg Val Trp Gln Glu 
85 90 95 
Gln Pro Asp Met Gln Ile Pro Phe Asp His Cys Trp Thr Asp Tyr His 
100 105 110 



Val Ser Ile Leu Arg Pro Glu Gly Phe Leu Tyr Leu Leu Glu Thr Arg 
115 120 125 



Glu Leu Leu Leu Ser Ser Val Glu Leu Ile Val Leu Glu Asp Cys His 
130 135 140 



Asp Ser Ala Val Tyr Gln Arg Ile Arg Pro Leu Phe Glu Asn His Ile 
145 150 155 160 



Met Pro Ala Pro Pro Ala Asp Arg Pro Arg Ile Leu Gly Leu Ala Gly 
165 170 175 



Pro Leu His Ser Ala Gly Cys Glu Leu Gln Gln Leu Ser Ala Met Leu 
180 185 190 



Ala Thr Leu Glu Gln Ser Val Leu Cys Gln Ile Glu Thr Ala Ser Asp 
195 200 205 



Ile Val Thr Val Leu Arg Tyr Cys Ser Arg Pro His Glu Tyr Ile Val 
210 215 220 



Gln Cys Ala Pro Phe Glu Met Asp Glu Leu Ser Leu Val Leu Ala Asp 
225 230 235 240 



Val Leu Asn Thr His Lys Ser Phe Leu Leu Asp His Arg Tyr Asp Pro 
245 250 255 



Tyr Glu Ile Tyr Gly Thr Asp Gln Phe Met Asp Glu Leu Lys Asp Ile 
260 265 270 



Pro Asp Pro Lys Val Asp Pro Leu Asn Val Ile Asn Ser Leu Leu Val 
275 280 285 



Val Leu His Glu Met Gly Pro Trp Cys Thr Gln Arg Ala Ala His His 
290 295 300 



Phe Tyr Gln Cys Asn Glu Lys Leu Lys Val Lys Thr Pro His Glu Arg 
305 310 315 320 



His Tyr Leu Leu Tyr Cys Leu Val Ser Thr Ala Leu Ile Gln Leu Tyr 
325 330 335 



Ser Leu Cys Glu His Ala Phe His Arg His Leu Gly Ser Gly Ser Asp 
340 345 350 



Ser Arg Gln Thr Ile Glu Arg Tyr Ser Ser Pro Lys Val Arg Arg Leu 
355 360 365 



Leu Gln Thr Leu Arg Cys Phe Lys Pro Glu Glu Val His Thr Gln Ala 
370 375 380 



Asp Gly Leu Arg Arg Met Arg His Gln Val Asp Gln Ala Asp Phe Asn 
385 390 395 400 
Arg Leu Ser His Thr Leu Glu Ser Lys Cys Arg Met Val Asp Gln Met 
405 410 415 



Asp Gln Pro Pro Thr Glu Thr Arg Ala Leu Val Ala Thr Leu Glu Gln 
420 425 430 



Ile Leu His Thr Thr Glu Asp Arg Gln Thr Asn Arg Ser Ala Ala Arg 
435 440 445 



Val Thr Pro Thr Pro Thr Pro Ala His Ala Lys Pro Lys Pro Ser Ser 
450 455 460 



Gly Ala Asn Thr Ala Gln Pro Arg Thr Arg Arg Arg Val Tyr Thr Arg 
465 470 475 480 



Arg His His Arg Asp His Asn Asp Gly Ser Asp Thr Leu Cys Ala Leu 
485 490 495 



Ile Tyr Cys Asn Gln Asn His Thr Ala Arg Val Leu Phe Glu Leu Leu 
500 505 510 



Ala Glu Ile Ser Arg Arg Asp Pro Asp Leu Lys Phe Leu Arg Cys Gln 
515 520 525 



Tyr Thr Thr Asp Arg Val Ala Asp Pro Thr Thr Glu Pro Lys Glu Ala 
530 535 540 
Glu Leu Glu His Arg Arg Gln Glu Glu Val Leu Lys Arg Phe Arg Met 
545 550 555 560 



His Asp Cys Asn Val Leu Ile Gly Thr Ser Val Leu Glu Glu Gly Ile 
565 570 575 



Asp Val Pro Lys Cys Asn Leu Val Val Arg Trp Asp Pro Pro Thr Thr 
580 585 590 



Tyr Arg Ser Tyr Val Gln Cys Lys Gly Arg Ala Arg Ala Ala Pro Ala 
595 600 605 



Tyr His Val Ile Leu Val Ala Pro Ser Tyr Lys Ser Pro Thr Val Gly 
610 615 620 



Ser Val Gln Leu Thr Asp Arg Ser His Arg Tyr Ile Cys Ala Thr Gly 
625 630 635 640 



Asp Thr Thr Glu Ala Asp Ser Asp Ser Asp Asp Ser Ala Met Pro Asn 
645 650 655 



Ser Ser Gly Ser Asp Pro Tyr Thr Phe Gly Thr Ala Arg Gly Thr Val 
660 665 670 



Lys Ile Leu Asn Pro Glu Val Phe Ser Lys Gln Pro Pro Thr Ala Cys 
675 680 685 
Asp Ile Lys Leu Gln Glu Ile Gln Asp Glu Leu Pro Ala Ala Ala Gln 
690 695 700 



Leu Asp Thr Ser Asn Ser Ser Asp Glu Ala Val Ser Met Ser Asn Thr 
705 710 715 720 



Ser Pro Ser Glu Ser Ser Thr Glu Gln Lys Ser Arg Arg Phe Gln Cys 
725 730 735 



Glu Leu Ser Ser Leu Thr Glu Pro Glu Asp Thr Ser Asp Thr Thr Ala 
740 745 750 



Glu Ile Asp Thr Ala His Ser Leu Ala Ser Thr Thr Lys Asp Leu Val 
755 760 765 



His Gln Met Ala Gln Tyr Arg Glu Ile Glu Gln Met Leu Leu Ser Lys 
770 775 780 



Cys Ala Asn Thr Glu Pro Pro Glu Gln Glu Gln Ser Glu Ala Glu Arg 
785 790 795 800 



Phe Ser Ala Cys Leu Ala Ala Tyr Arg Pro Lys Pro His Leu Leu Thr 
805 810 815 



Gly Ala Ser Val Asp Leu Gly Ser Ala Ile Ala Leu Val Asn Lys Tyr 
820 825 830 



Cys Ala Arg Leu Pro Ser Asp Thr Phe Thr Lys Leu Thr Ala Leu Trp 
835 840 845 



Arg Cys Thr Arg Asn Glu Arg Ala Gly Val Thr Leu Phe Gln Tyr Thr 
850 855 860 



Leu Arg Leu Pro Ile Asn Ser Pro Leu Lys His Asp Ile Val Gly Leu 
865 870 875 880 



Pro Met Pro Thr Gln Thr Leu Ala Arg Arg Leu Ala Ala Leu Gln Ala 
885 890 895 



Cys Val Glu Leu His Arg Ile Gly Glu Leu Asp Asp Gln Leu Gln Pro 
900 905 910 



Ile Gly Lys Glu Gly Phe Arg Ala Leu Glu Pro Asp Trp Glu Cys Phe 
915 920 925 



Glu Leu Glu Pro Glu Asp Glu Gln Ile Val Gln Leu Ser Asp Glu Pro 
930 935 940 



Arg Pro Gly Thr Thr Lys Arg Arg Gln Tyr Tyr Tyr Lys Arg Ile Ala 
945 950 955 960 



Ser Glu Phe Cys Asp Cys Arg Pro Val Ala Gly Ala Pro Cys Tyr Leu 
965 970 975 



Tyr Phe Ile Gln Leu Thr Leu Gln Cys Pro Ile Pro Glu Glu Gln Asn 
980 985 990 
Thr Arg Gly Arg Lys Ile Tyr Pro Pro Glu Asp Ala Gln Gln Gly Phe 
995 1000 1005 



Gly Ile Leu Thr Thr Lys Arg Ile Pro Lys Leu Ser Ala Phe Ser 
1010 1015 1020 



Ile Phe Thr Arg Ser Gly Glu Val Lys Val Ser Leu Glu Leu Ala 
1025 1030 1035 



Lys Glu Arg Val Ile Leu Thr Ser Glu Gln Ile Val Cys Ile Asn 
1040 1045 1050 



Gly Phe Leu Asn Tyr Thr Phe Thr Asn Val Leu Arg Leu Gln Lys 
1055 1060 1065 



Phe Leu Met Leu Phe Asp Pro Asp Ser Thr Glu Asn Cys Val Phe 
1070 1075 1080 



Ile Val Pro Thr Val Lys Ala Pro Ala Gly Gly Lys His Ile Asp 
1085 1090 1095 



Trp Gln Phe Leu Glu Leu Ile Gln Ala Asn Gly Asn Thr Met Pro 
1100 1105 1110 



Arg Ala Val Pro Asp Glu Glu Arg Gln Ala Gln Pro Phe Asp Pro 
1115 1120 1125 
Gln Arg Phe Gln Asp Mb Val Val Met Pro Trp Tyr Arg Asn Gln 
1130 1135 1140 



Asp Gln Pro Gln Tyr Phe Tyr Val Ala Glu Ile Cys Pro His Leu 
1145 1150 1155 



Ser Pro Leu Ser Cys Phe Pro Gly Asp Asn Tyr Arg Thr Phe Lys 
1160 1165 1170 



His Tyr Tyr Leu Val Lys Tyr Gly Leu Thr Ile Gln Asn Thr Ser 
1175 1180 1185 



Gln Pro Leu Leu Asp Val Asp His Thr Ser Ala Arg Leu Asn Phe 
1190 1195 1200 



Leu Thr Pro Arg Tyr Val Asn Arg Lys Gly Val Ala Leu Pro Thr 
1205 1210 1215 



Ser Ser Glu Glu Thr Lys Arg Ala Lys Arg Glu Asn Leu Glu Gln 
1220 1225 1230 



Lys Gln Ile Leu Val Pro Glu Leu Cys Thr Val His Pro Phe Pro 
1235 1240 1245 



Ala Ser Leu Trp Arg Thr Ala Val Cys Leu Pro Cys Ile Leu Tyr 
1250 1255 1260 



Arg Ile Asn Gly Leu Leu Leu Ala Asp Asp Ile Arg Lys Gln Val 
1265 1270 1275 



Ser Ala Asp Leu Gly Leu Gly Arg Gln Gln Ile Glu Asp Glu Asp 
1280 1285 1290 



Phe Glu Trp Pro Met Leu Asp Phe Gly Trp Ser Leu Ser Glu Val 
1295 1300 1305 



Leu Lys Lys Ser Arg Glu Ser Lys Gln Lys Glu Ser Leu Lys Asp 
1310 1315 1320 



Asp Thr Ile Asn Gly Lys Asp Leu Ala Asp Val Glu Lys Lys Pro 
1325 1330 1335 



Thr Ser Glu Glu Thr Gln Leu Asp Lys Asp Ser Lys Asp Asp Lys 
1340 1345 1350 



Val Glu Lys Ser Ala Ile Glu Leu Ile Ile Glu Gly Glu Glu Lys 
1355 1360 1365 



Leu Gln Glu Ala Asp Asp Phe Ile Glu Ile Gly Thr Trp Ser Asn 
1370 1375 1380 



Asp Met Ala Asp Asp Ile Ala Ser Phe Asn Gln Glu Asp Asp Asp 
1385 1390 1395 



Glu Asp Asp Ala Phe His Leu Pro Val Leu Pro Ala Asn Val Lys 
1400 1405 1410 

Phe Cys Asp Gln Gln Thr Arg Tyr Gly Ser Pro Thr Phe Trp Asp 
1415 1420 1425 



Val Ser Asn Gly Glu Ser Gly Phe Lys Gly Pro Lys Ser Ser Gln 
1430 1435 1440 



Asn Lys Gln Gly Gly Lys Gly Lys Ala Lys Gly Pro Ala Lys Pro 
1445 1450 1455 



Thr Phe Asn Tyr Tyr Asp Ser Asp Asn Ser Leu Gly Ser Ser Tyr 
1460 1465 1470 



Asp Asp Asp Asp Asn Ala Gly Pro Leu Asn Tyr Met His His Asn 
1475 1480 1485 



Tyr Ser Ser Asp Asp Asp Asp Val Ala Asp Asp Ile Asp Ala Gly 
1490 1495 1500 



Arg Ile Ala Phe Thr Ser Lys Asn Glu Ala Glu Thr Ile Glu Thr 
1505 1510 1515 



Ala Gln Glu Val Glu Lys Arg Gln Lys Gln Leu Ser Ile Ile Gln 
1520 1525 1530 



Ala Thr Asn Ala Asn Glu Arg Gln Tyr Gln Gln Thr Lys Asn Leu 
1535 1540 1545 
Leu Ile Gly Phe Asn Phe Lys His Glu Asp Gln Lys Glu Pro Ala 
1550 1555 1560 



Thr Ile Arg Tyr Glu Glu Ser Ile Ala Lys Leu Lys Thr Glu Ile 
1565 1570 1575 



Glu Ser Gly Gly Met Leu Val Pro His Asp Gln Gln Leu Val Leu 
1580 1585 1590 



Lys Arg Ser Asp Ala Ala Glu Ala Gln Val Ala Lys Val Ser Met 
1595 1600 1605 



Met Glu Leu Leu Lys Gln Leu Leu Pro Tyr Val Asn Glu Asp Val 
1610 1615 1620 



Leu Ala Lys Lys Leu Gly Asp Arg Arg Glu Leu Leu Leu Ser Asp 
1625 1630 1635 



Leu Val Glu Leu Asn Ala Asp Trp Val Ala Arg His Glu Gln Glu 
1640 1645 1650 



Thr Tyr Asn Val Met Gly Cys Gly Asp Ser Phe Asp Asn Tyr Asn 
1655 1660 1665 



Asp His His Arg Leu Asn Leu Asp Glu Lys Gln Leu Lys Leu Gln 
1670 1675 1680 
Tyr Glu Arg Ile Glu Ile Glu Pro Pro Thr Ser Thr Lys Ala Ile 
1685 1690 1695 



Thr Ser Ala Ile Leu Pro Ala Gly Phe Ser Phe Asp Arg Gln Pro 
1700 1705 1710 



Asp Leu Val Gly His Pro Gly Pro Ser Pro Ser Ile Ile Leu Gln 
1715 1720 1725 



Ala Leu Thr Met Ser Asn Ala Asn Asp Gly Ile Asn Leu Glu Arg 
1730 1735 1740 



Leu Glu Thr Ile Gly Asp Ser Phe Leu Lys Tyr Ala Ile Thr Thr 
1745 1750 1755 



Tyr Leu Tyr Ile Thr Tyr Glu Asn Val His Glu Gly Lys Leu Ser 
1760 1765 1770 



His Leu Arg Ser Lys Gln Val Ala Asn Leu Asn Leu Tyr Arg Leu 
1775 1780 1785 



Gly Arg Arg Lys Arg Leu Gly Glu Tyr Met Ile Ala Thr Lys Phe 
1790 1795 1800 



Glu Pro His Asp Asn Trp Leu Pro Pro Cys Tyr Tyr Val Pro Lys 
1805 1810 1815 
Glu Leu Glu Lys Ala Leu Ile Glu Ala Lys Ile Pro Thr His His 
1820 1825 1830 



Trp Lys Leu Ala Asp Leu Leu Asp Ile Lys Asn Leu Ser Ser Val 
1835 1840 1845 



Gln Ile Cys Glu Met Val Arg Glu Lys Ala Asp Ala Leu Gly Leu 
1850 1855 1860 



Glu Gln Asn Gly Gly Ala Gln Asn Gly Gln Leu Asp Asp Ser Asn 
1865 1870 1875 



Asp Ser Cys Asn Asp Phe Ser Cys Phe Ile Pro Tyr Asn Leu Val 
1880 1885 1890 



Ser Gln His Ser Ile Pro Asp Lys Ser Ile Ala Asp Cys Val Glu 
1895 1900 1905 



Ala Leu Ile Gly Ala Tyr Leu Ile Glu Cys Gly Pro Arg Gly Ala 
1910 1915 1920 



Leu Leu Phe Met Ala Trp Leu Gly Val Arg Val Leu Pro Ile Thr 
1925 1930 1935 



Arg Gln Leu Asp Gly Gly Asn Gln Glu Gln Arg Ile Pro Gly Ser 
1940 1945 1950 



Thr Lys Pro Asn Ala Glu Asn Val Val Thr Val Tyr Gly Ala Trp 
1955 1960 1965 



Pro Thr Pro Arg Ser Pro Leu Leu His Phe Ala Pro Asn Ala Thr 
1970 1975 1980 



Glu Glu Leu Asp Gln Leu Leu Ser Gly Phe Glu Glu Phe Glu Glu 
1985 1990 1995 



Ser Leu Gly Tyr Lys Phe Arg Asp Arg Ser Tyr Leu Leu Gln Ala 
2000 2005 2010 



Met Thr His Ala Ser Tyr Thr Pro Asn Arg Leu Thr Asp Cys Tyr 
2015 2020 2025 



Gln Arg Leu Glu Phe Leu Gly Asp Ala Val Leu Asp Tyr Leu Ile 
2030 2035 2040 



Thr Arg His Leu Tyr Glu Asp Pro Arg Gln His Ser Pro Gly Ala 
2045 2050 2055 



Leu Thr Asp Leu Arg Ser Ala Leu Val Asn Asn Thr Ile Phe Ala 
2060 2065 2070 



Ser Leu Ala Val Arg His Gly Phe His Lys Phe Phe Arg His Leu 
2075 2080 2085 



Ser Pro Gly Leu Asn Asp Val Ile Asp Arg Phe Val Arg Ile Gln 
2090 2095 2100 
Gln Glu Asn Gly His Cys Ile Ser Glu Glu Tyr Tyr Leu Leu Ser 
2105 2110 2115 



Glu Glu Glu Cys Asp Asp Ala Glu Asp Val Glu Val Pro Lys Ala 
2120 2125 2130 



Leu Gly Asp Val Phe Glu Ser Ile Ala Gly Ala Ile Phe Leu Asp 
2135 2140 2145 



Ser Asn Met Ser Leu Asp Val Val Trp His Val Tyr Ser Asn Met 
2150 2155 2160 



Met Ser Pro Glu Ile Glu Gln Phe Ser Asn Ser Val Pro Lys Ser 
2165 2170 2175 



Pro Ile Arg Glu Leu Leu Glu Leu Glu Pro Glu Thr Ala Lys Phe 
2180 2185 2190 



Gly Lys Pro Glu Lys Leu Ala Asp Gly Arg Arg Val Arg Val Thr 
2195 2200 2205 



Val Asp Val Phe Cys Lys Gly Thr Phe Arg Gly Ile Gly Arg Asn 
2210 2215 2220 



Tyr Arg Ile Ala Lys Cys Thr Ala Ala Lys Cys Ala Leu Arg Gln 
2225 2230 2235 
Leu Lys Lys Gln Gly Leu Ile Ala Lys Lys Asp 
2240 2245 
Although claims 2 and 4-15 are, at least partially (as far as in vivo 
methods are concerned), directed to a method of treatment of the 
human/animal body, the search has been carried out and based on the 
alleged effects of the compound/composition. 



Continuation of Box I.1 



Although claims 18 and 19 could, at least partially, be considered a 
scheme, rule and method for doing business (Rule 39.1(iii) PCT), the 
search has been carried out as far as possible in our systematic 
documentation. 



Continuation of Box I.2 
Claims Nos.: 17 



Present claim 17 relates to a method of formulating a pharmaceutical 
preparation including one or more agents identified by an assay, without 
giving a true technical characterization. Moreover, no such compounds are 
defined in the application. In consequence, the scope of said claim is 
ambiguous and vague, and its subject-matter is not sufficiently 
disclosed and supported (Art. 5 and 6 PCT). No search can be carried out 
for such purely speculative claims whose wording is, in fact, a mere 
recitation of the results to be achieved. 

The applicant's attention is drawn to the fact that claims, or parts of 
claims, relating to inventions in respect of which no international 
search report has been established need not be the subject of an 
international preliminary examination (Rule 66.1(e) PCT). The applicant 
is advised that the EPO policy when acting as an International 
Preliminary Examining Authority is normally not to carry out a 
preliminary examination on matter which has not been searched. This is 
the case irrespective of whether or not the claims are amended following 